Summary


In this thesis, we develop faster exact algorithms (faster with respect to the
worst-case running time) for special cases of graph problems. These algorithms
are mostly based on dynamic programming and on 2-SAT programming. Dynamic
programming describes the procedure of recursively breaking down a problem
into subproblems such that these subproblems have common subsubproblems.
Once these subproblems have been solved optimally, the dynamic program
combines their solutions into an optimal solution for the original problem.
2-SAT programming refers to the process of expressing a problem by a set of
2-SAT formulas (propositional formulas in conjunctive normal form in which
each clause consists of at most two literals). Here, satisfying truth
assignments for a subset of the 2-SAT formulas must correspond to a solution
of the original problem. If a 2-SAT formula is satisfiable, then a satisfying
truth assignment can be computed in time linear in the length of the formula.
Hence, if suitable 2-SAT formulas can be constructed in time polynomial in the
input size of the original problem, then the original problem can be solved in
polynomial time. In the following, we describe the main results of the thesis.

The DIAMETER problem asks for the largest distance between any two vertices
in a given undirected graph. The result (the diameter of the input graph) is
among the most important parameters in graph analysis. In this thesis, we
obtain both positive and negative results for DIAMETER. We focus on
parameterized algorithms for parameter combinations that are small in many
practical applications and on parameters that measure a distance from
triviality.

The problem LENGTH-BOUNDED CUT asks whether there is an edge set of bounded
size in an input graph such that removing these edges increases the distance
between two given vertices to at least a given minimum. We confirm a
conjecture from the scientific literature that LENGTH-BOUNDED CUT can be
solved in time polynomial in the input size on unit interval graphs (interval
graphs in which all intervals have the same length). The algorithm is based
on dynamic programming.

k-DISJOINT SHORTEST PATHS is the problem of finding vertex-disjoint paths
between k given pairs of vertices such that each of the k paths is a shortest
path between its respective end vertices. We describe a dynamic program with
running time $n^{O(k \cdot k!)}$ for this problem, where n is the number of
vertices in the input graph. This shows that k-DISJOINT SHORTEST PATHS can be
solved in polynomial time for each constant k, which had been an open problem
in algorithmic graph theory for over 20 years.

The problem TREE CONTAINMENT asks whether a given phylogenetic tree T is
contained in a given phylogenetic network N. A phylogenetic network (or a
phylogenetic tree) is a directed acyclic graph (or a directed tree) with
exactly one source in which each vertex has at most one outgoing or at most
one incoming edge and each leaf carries a label. The problem stems from
bioinformatics, more precisely from the search for the tree of life (the
history of speciation). We introduce a new variant of the problem, which we
call SOFT TREE CONTAINMENT and which accounts for certain sources of
uncertainty. Using 2-SAT programming, we show that SOFT TREE CONTAINMENT can
be solved in polynomial time if N is a phylogenetic tree in which at most two
leaves carry the same label. We complement this result with a proof that SOFT
TREE CONTAINMENT is NP-hard even if N is restricted to phylogenetic trees in
which at most three leaves carry the same label.

Finally, we consider the problem REACHABLE OBJECT. Here, we search for a
sequence of rational trades between agents such that a given agent obtains a
given object. A trade is rational if both agents involved in it prefer their
new object over their respective old object. REACHABLE OBJECT is a
generalization of the well-known and well-studied problem HOUSING MARKET in
which the agents are arranged in a graph and only neighboring agents can trade
objects with each other. We show that REACHABLE OBJECT is NP-hard even if each
agent prefers at most three objects over its initial object and that REACHABLE
OBJECT is polynomial-time solvable if each agent prefers at most two objects
over its initial object. We also give a polynomial-time algorithm for the
special case in which the graph of agents is a cycle. This polynomial-time
algorithm is based on 2-SAT programming.
Abstract


This thesis presents faster (in terms of worst-case running times) exact algorithms 
for special cases of graph problems through dynamic programming and 2-SAT 
programming. Dynamic programming describes the procedure of breaking down 
a problem recursively into overlapping subproblems, that is, subproblems with 
common subsubproblems. Given optimal solutions to these subproblems, the 
dynamic program then combines them into an optimal solution for the original 
problem. 2-SAT programming refers to the procedure of reducing a problem to 
a set of 2-SAT formulas, that is, Boolean formulas in conjunctive normal form 
in which each clause contains at most two literals. Computing whether such 
a formula is satisfiable (and computing a satisfying truth assignment, if one 
exists) takes linear time in the formula length. Hence, when satisfying truth 
assignments to some 2-SAT formulas correspond to a solution of the original 
problem and all formulas can be computed efficiently, that is, in polynomial 
time in the input size of the original problem, then the original problem can be 
solved in polynomial time. We next describe our main results. 

DIAMETER asks for the maximal distance between any two vertices in a 
given undirected graph. It is arguably among the most fundamental graph 
parameters. We provide both positive and negative parameterized results for 
distance-from-triviality-type parameters and parameter combinations that were 
observed to be small in real-world applications. 

In LENGTH-BOUNDED CUT, we search for a bounded-size set of edges that 
intersects all paths between two given vertices of at most some given length. 
We confirm a conjecture from the literature by providing a polynomial-time 
algorithm for proper interval graphs which is based on dynamic programming. 

k-DISJOINT SHORTEST PATHS is the problem of finding (vertex-)disjoint paths 
between given vertex terminals such that each of these paths is a shortest path 
between the respective terminals. Its complexity for constant k ≥ 3 has been
an open problem for over 20 years. Using dynamic programming, we show 
that k-DISJOINT SHORTEST PATHS can be solved in polynomial time for each 
constant k. 

The problem TREE CONTAINMENT asks whether a phylogenetic tree T is 
contained in a phylogenetic network N. A phylogenetic network (or tree) is a 
leaf-labeled single-source directed acyclic graph (or tree) in which each vertex 
has in-degree at most one or out-degree at most one. The problem stems from 
computational biology in the context of the tree of life (the history of speciation). 
We introduce a particular variant that resembles certain types of uncertainty in
the input. We show that if each leaf label occurs at most twice in a phylogenetic 
tree N, then the problem can be solved in polynomial time and if labels can 
occur up to three times, then the problem becomes NP-hard. 

Lastly, REACHABLE OBJECT is the problem of deciding whether there is a 
sequence of rational trades of objects among agents such that a given agent 
can obtain a certain object. A rational trade is a swap of objects between two 
agents where both agents profit from the swap, that is, they receive objects 
they prefer over the objects they trade away. This problem can be seen as a 
natural generalization of the well-known and well-studied HOUSING MARKET 
problem where the agents are arranged in a graph and only neighboring agents 
can trade objects. We prove a dichotomy result that states that the problem 
is polynomial-time solvable if each agent prefers at most two objects over its 
initially held object and it is NP-hard if each agent prefers at most three objects 
over its initially held object. We also provide a polynomial-time 2-SAT program 
for the case where the graph of agents is a cycle. 
Preface


This thesis contains some of the results of my research at the Technische
Universität Berlin in the Algorithmics and Computational Complexity group
headed by Prof. Rolf Niedermeier from January 2017 to September 2020. The
presented findings are partially based on published papers and partially based
on papers that are so far only available in the arXiv repository. Many of these
results were prepared in close collaboration with my coauthors. These are (in 
alphabetical order) Jiehua Chen, Vincent Froese, Klaus Heeger, Dušan Knop, 
Josef Malik, André Nichterlein, Malte Renken, Mathias Weller, Gerhard J. 
Woeginger, and Philipp Zschoche. 

In the following, I sketch the story behind the research projects corresponding 
to the different chapters as well as briefly state my respective contributions. 


Chapter 3. After I finished my master’s thesis in late 2016 in the young field
of FPT in P and started my PhD program in 2017, André Nichterlein (TU Berlin)
suggested exploring this field further. He asked me to choose between either
DIAMETER or MAXIMUM FLOW to work on next and I chose DIAMETER. Most
of the results featured in our conference paper ([BN19]), which I presented at
the 11th International Conference on Algorithms and Complexity (CIAC ’19)
in Rome, Italy, are based on my ideas and André Nichterlein helped polish
both the results and the paper as a whole. An extended version featuring more
details and all proofs is available in the arXiv repository and has been
submitted to a journal.


Chapter 4. From September 2018 to September 2019, Dušan Knop (Czech
Technical University in Prague) held a postdoctoral position in our group. He
suggested studying the problem LENGTH-BOUNDED CUT. Initially, he was
interested in certain W[1]-hardness results and started working on it with Klaus
Heeger (TU Berlin). I joined the project soon after. During our research we
found that the computational complexity of solving LENGTH-BOUNDED CUT on
interval graphs and proper interval graphs was stated as an open problem in the
literature. We showed that the problem is polynomial-time solvable on proper
interval graphs and we also proved the W[1]-hardness results that we were
initially looking for, that is, for the feedback vertex number and the combined
parameter pathwidth plus maximum degree. The polynomial-time algorithm was
mostly my contribution while the W[1]-hardness results are mostly due to Klaus
Heeger. The corresponding paper ([BHK20]) was presented by Klaus Heeger at
the 31st International Symposium on Algorithms and Computation (ISAAC ’20),
which was held virtually in December 2020. An extended version is available in
the arXiv repository and has been submitted to a journal.


Chapter 5. Our group holds a research retreat each year. In September 2019 
at the retreat in Schloss Neuhausen (Brandenburg, Germany), André Nichterlein
suggested studying a variant of DISJOINT PATHS and Anne-Sophie
Himmel, Malte Renken, André Nichterlein, Philipp Zschoche (all TU Berlin), 
and I started working on it there. During the retreat, we studied different 
versions of DISJOINT PATHS and decided that we wanted to tackle the version 
DISJOINT SHORTEST PATHS after the retreat. It was known from the literature 
that this problem is NP-hard when the number k of shortest paths in the
solution is part of the input and it had been an open problem for over
twenty years whether there exists a polynomial-time algorithm for constant
values of k. For k = 2, an $O(n^8)$-time algorithm was known, where n is the
number of vertices in the input graph. Between December 2019 and January 
2020, William Lochet (University of Bergen) and we independently answered 
the open question in the affirmative. William Lochet was the first to publish 
his paper at the 32nd Annual ACM-SIAM Symposium on Discrete Algorithms
(SODA ’21) [Loc21]. While his algorithm has a running time of $n^{O(k^{5^k})}$, where
the Landau notation hides a constant 9°° in the exponent, we have since worked 
on improving the running time of our algorithm to $O(k \cdot n^{16k \cdot k! + k + 1})$. We also
proved W[1]-hardness for DISJOINT SHORTEST PATHS with respect to k. All
coauthors except for Anne-Sophie Himmel, who left academia shortly after the 
retreat and has withdrawn her authorship of the corresponding paper, have 
worked roughly equally on all parts of the paper. I was less involved in the 
W[1]-hardness result and instead designed a dynamic program for DISJOINT
SHORTEST PATHS on directed acyclic graphs which is used as a subroutine in our 
main algorithm. I presented the corresponding paper at the 48th International
Colloquium on Automata, Languages, and Programming (ICALP ’21) [Ben+21].
An extended version of the paper is available in the arXiv repository. 


Chapter 6. At the retreat in April 2017 near Boiensdorf (Mecklenburg- 
Vorpommern, Germany) Mathias Weller (University of Paris-Est) presented a 
problem called TREE CONTAINMENT that stems from computational biology. 
Josef Malik (Czech Technical University in Prague), Mathias Weller, and I
started working on it. Unfortunately, we had only limited success during the
retreat but Mathias Weller and I wanted to continue working on it after the 
retreat. Josef Malik also wanted to participate further but did not have the 
required time to do so. For this reason, most of the results were achieved 
in equal parts by Mathias Weller and me in close collaboration. I presented 
the corresponding paper ([BMW18]) at the 16th Scandinavian Symposium and
Workshops on Algorithm Theory (SWAT ’18) in June 2018 in Malmö, Sweden. 
An extended version is available in the HAL repository and is accepted for 
publication in the Journal of Graph Algorithms and Applications. 


Chapter 7. Rolf Niedermeier presented a paper on REACHABLE OBJECT at 
the retreat in Darlingerode (Saxony-Anhalt, Germany) in March 2018. Jiehua 
Chen (TU Vienna), Vincent Froese (TU Berlin), Gerhard J. Woeginger (RWTH 
Aachen University), and I chose this problem to work on during the retreat. 
We achieved a few hardness results as well as a polynomial-time algorithm 
for short preference lists of all agents in close collaboration during the retreat. 
However, there was an intriguing open problem left when the input graph is a 
path that was described in the literature to be “at the frontier of tractability, 
despite its simplicity”. Later that year, I resolved this case by providing a
polynomial-time algorithm. In the meantime, a very similar algorithm was
developed independently by Sen Huang and Mingyu Xiao. We contacted the
authors and invited them to merge the two papers but they declined because of
Chinese regulations. Their paper was presented at the 33rd AAAI Conference on
Artificial Intelligence (AAAI ’19) and is published in Autonomous Agents and
Multi-Agent Systems [HX20]. We have since improved our algorithm to also work
for cycles but so far the paper is only available in the arXiv repository [Ben+19a].
Chapter 1


Introduction 


When confronted with a new problem, one of the first choices we face is to select 
a set of tools to tackle the problem with. Sometimes, none of the tools we know 
is useful for the task and we give up or we come up with a new (specialized) 
tool. Most often, however, (some of) the tools we already know are useful and
our task becomes much easier once we have figured out the correct tool for the
job. So how do we choose the correct tool? Do we need to try every possible
tool? Of course not. Does choosing the correct tool come down to experience?
While experience definitely helps, there are oftentimes rules or heuristics we
can follow that guide us to the correct tool. Finding these rules and heuristics
is important as it saves us and others the time and effort of trying the wrong
tools and of collecting years of experience before becoming efficient problem
solvers.

While the above holds in general, we want to focus on algorithmic problems 
and exact algorithms in this thesis. The tools available for such tasks are 
numerous. When considering computationally easy problems (problems in P), 
we often start with greedy algorithms but also tools like divide and conquer, 
dynamic programming, or modeling with a flow network come to mind. When 
considering computationally hard (NP-hard) problems, then we can use some of 
the previous tools or we can refer to tools like branch and bound, backtracking, 
integer linear programming (ILP), modeling as a SAT problem, color-coding, 
or data reduction. All of the mentioned tools are very well understood and, for
each of them, we know at least some rules for when to apply it. Divide and
conquer and dynamic programming, for example, are the first choices when a
problem can be decomposed into smaller instances of the same problem. This is
not to say, however, that we know everything about these tools already. In this
thesis, we investigate two tools in more depth. These are dynamic programming
and 2-SAT programming. 2-SAT programming is a tool that has been used in
the literature much less than dynamic programming. For dynamic programming,
we will not find additional rules for when to use it but rather some rules for
how to apply it. For 2-SAT programming, we will investigate where and how it
was used so far, develop our own experience by applying it to two problems, and
conclude with two heuristics for when 2-SAT programming might be a good fit.

One might ask why we chose exactly these two tools. On the one hand, this
is to a certain degree due to pure chance. These tools just happened to work well
for the problems we studied in the past. On the other hand, since we worked 
with these tools quite successfully, we feel confident that we can add something 
to the topic. 

In the following, we give an introduction to dynamic programming and 2-SAT 
programming. We conclude this chapter with an overview of the results in
this thesis. 


1.1 Dynamic Programming 


Dynamic programming describes the procedure of recursively breaking down 
a problem into smaller overlapping subproblems and computing an optimal 
solution from solutions for these subproblems. Subproblems overlap if they 
have common subsubproblems. The analogous technique for non-overlapping 
subproblems is called divide and conquer. An example of divide and conquer is
merge sort, where in each step the array of numbers to sort is split into two
parts that are sorted independently and then merged.
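
To illustrate the non-overlapping case, the following is a minimal Python sketch of merge sort. The function name `merge_sort` and the list-based interface are assumptions made for this illustration and are not taken from the thesis.

```python
def merge_sort(values):
    """Return a sorted copy of `values` using divide and conquer."""
    # Base case: lists of length 0 or 1 are already sorted.
    if len(values) <= 1:
        return list(values)

    # Divide: the two halves share no elements, so the two
    # subproblems do not overlap and are solved independently.
    mid = len(values) // 2
    left = merge_sort(values[:mid])
    right = merge_sort(values[mid:])

    # Conquer: merge the two sorted halves in linear time.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

Since no subproblem is ever requested twice, there is nothing to be gained from storing intermediate solutions in a table, which is exactly what distinguishes this setting from dynamic programming.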

The term dynamic programming was coined around 1952 by Richard Bell- 
man [Bel52]. Dynamic programming has since then become a staple of computer 
science which is taught in countless classes and books on algorithms, applied to 
computational problems such as LONGEST COMMON SUBSEQUENCE, LONGEST 
INCREASING SUBSEQUENCE, MAXIMUM WEIGHT INDEPENDENT SET on trees, 
and APPROXIMATE STRING MATCHING [Cor+09, Ski20]. It has been used in
numerous fields including machine learning [BNK20, BSW89], computer vi- 
sion [AWJ90], computational biology [Che+01, FT97, San00], and computational 
chemistry [Ari00, Gro+19]. It also had a large impact on parameterized algorith- 
mics as the go-to tool for algorithms on tree decompositions of graphs [Bod88, 
LZ20, Mar20]. 

In the following, we will first give a general structure of how to apply dynamic
programming. We then work through a standard example for dynamic programming
using our general structure and finally describe how this structure will guide
us throughout the first part of this thesis. Dynamic programming is most
often implemented by filling a table where each entry stores the solution to
some subproblem. There are four main questions that one should answer when
developing a dynamic program:


1. What does a table entry represent?
2. What dimension shall the table have?
3. How to compute each table entry?
4. How can the solution of the original problem be computed once the
table is completely filled?


We present a standard dynamic program for the problem KNAPSACK. For this 
problem, we are given a set X of n objects each with a positive integer weight 
(denoted by w) and a positive integer value (denoted by v) and two integers B 
and k. The question then is whether there is a subset of objects whose total 
weight is at most B and whose total value is at least k. KNAPSACK is known to 
be NP-complete but it allows for a pseudo-polynomial-time (polynomial if all
numbers are encoded in unary) algorithm [TM90].

Without loss of generality, let $X = \{o_1, o_2, \ldots, o_n\}$. By answering the four
questions above one by one, we explain how an existing $O(n \cdot B \cdot k)$-time dynamic
program for KNAPSACK works [TM90].


1. What does a table entry represent? 


Each entry in the table T represents a subproblem which is defined by a subset
of objects and two bounds $1 \le B' \le B$ and $1 \le k' \le k$. The value in each entry
in T is a binary value storing whether the respective subproblem is a yes- or a
no-instance.


2. What dimension shall the table have? 


Let $X_i := \{o_1, o_2, \ldots, o_i\}$ be the set of the first i objects. The table T has
entries for each subset $X_i$, where $i \in [n]$ and $[n] := \{1, 2, \ldots, n\}$. Moreover,
it has an entry for each combination of a subset $X_i$ and values $1 \le B' \le B$
and $1 \le k' \le k$. The dimension or type of T is therefore

\[ T \colon [n] \times [B] \times [k] \to \{\text{true}, \text{false}\}. \]


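3. How to compute each table entry?


Initially, we set

\[
T[1, B', k'] =
\begin{cases}
\text{true}, & \text{if } B' \ge w(o_1) \text{ and } k' \le v(o_1), \text{ and} \\
\text{false}, & \text{otherwise.}
\end{cases}
\]

Once all entries $T[i, B', k']$ for a specific i are computed, we can compute
an entry $t := T[i+1, B', k']$. We do so by distinguishing between the three
cases $B' < w(o_{i+1})$, $B' = w(o_{i+1})$, and $B' > w(o_{i+1})$. If $B' < w(o_{i+1})$, then

\[ t := T[i, B', k']. \]

If $B' = w(o_{i+1})$, then

\[
t :=
\begin{cases}
\text{true}, & \text{if } k' \le v(o_{i+1}), \text{ and} \\
T[i, B', k'], & \text{otherwise.}
\end{cases}
\]

Finally, if $B' > w(o_{i+1})$, then

\[
t :=
\begin{cases}
\text{true}, & \text{if } k' \le v(o_{i+1}), \text{ and} \\
T[i, B', k'] \lor T[i, B' - w(o_{i+1}), k' - v(o_{i+1})], & \text{otherwise.}
\end{cases}
\]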


The idea is the following. If there is already a solution of total weight at most $B'$
and total value at least $k'$ using only the first i objects, then this solution is
also a solution for the instance corresponding to $T[i+1, B', k']$. If no such
solution exists, then the “new” object $o_{i+1}$ has to be part of every solution.
If $B' < w(o_{i+1})$, then no solution exists in this case. If $B' = w(o_{i+1})$, then
there is a solution (the set $\{o_{i+1}\}$) if and only if $k' \le v(o_{i+1})$. If $B' > w(o_{i+1})$,
then either the set $\{o_{i+1}\}$ is a solution (if $k' \le v(o_{i+1})$) or any solution set S
contains $o_{i+1}$ and $S' := S \setminus \{o_{i+1}\} \neq \emptyset$ (if $k' > v(o_{i+1})$) such that $S'$ is a solution
for the problem corresponding to $T[i, B' - w(o_{i+1}), k' - v(o_{i+1})]$.


4. How can the solution of the original problem be computed once the table 
is completely filled? 


If each table entry is computed correctly, then the original instance is by 
definition a yes-instance if and only if T[n, B, k] = true. 


We skip the formal proof of correctness and the analysis of the running 
time [TM90]. 
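
The answers to the four questions can be turned into code almost line by line. The following Python sketch is one possible reading of the recurrence, assuming positive integer weights and values and bounds B, k of at least 1. The function name `knapsack_decision` and the nested-list table layout are choices made for this illustration and are not taken from [TM90].

```python
def knapsack_decision(weights, values, B, k):
    """Decide whether some subset of the objects has total weight
    at most B and total value at least k (assumes B >= 1, k >= 1)."""
    n = len(weights)
    # T[i][b][c] mirrors the table entry T[i, B', k'] from the text;
    # index 0 in every dimension is unused padding.
    T = [[[False] * (k + 1) for _ in range(B + 1)] for _ in range(n + 1)]

    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]  # object o_i
        for b in range(1, B + 1):
            for c in range(1, k + 1):
                if i == 1:
                    # Initialization: only o_1 is available.
                    T[i][b][c] = b >= w and c <= v
                elif b < w:
                    # o_i does not fit, fall back to the first i-1 objects.
                    T[i][b][c] = T[i - 1][b][c]
                elif c <= v:
                    # The set {o_i} alone is a solution.
                    T[i][b][c] = True
                elif b == w:
                    # o_i fits exactly but is not valuable enough alone.
                    T[i][b][c] = T[i - 1][b][c]
                else:
                    # Either skip o_i or take it and solve the rest.
                    T[i][b][c] = T[i - 1][b][c] or T[i - 1][b - w][c - v]

    # Question 4: the answer sits in a single table entry.
    return T[n][B][k]
```

The three nested loops make the $O(n \cdot B \cdot k)$ running time visible, and the last line shows that for KNAPSACK the fourth question has a trivial answer.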


The first part of this thesis is about dynamic programming. Therein, we study 
the problems DIAMETER, LENGTH-BOUNDED CUT, and k-DISJOINT SHORTEST 
PATHS. The DIAMETER problem asks for the longest shortest path between 
two vertices in a given graph. In LENGTH-BOUNDED CUT, we are given an
undirected graph, two terminal vertices s and t, and two integers k and ℓ. The
question is whether there is a set of at most k edges such that removing those
edges yields a graph in which the distance between s and t is larger than ℓ. For
the problem k-DISJOINT SHORTEST PATHS, we are given an undirected graph
and k terminal pairs $(s_i, t_i)$ and the question is whether there are k disjoint
paths $P_i$ such that $P_i$ is a shortest path between $s_i$ and $t_i$.
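
To make the first of these definitions concrete, the following Python sketch computes the diameter of a connected, unweighted, undirected graph by running a breadth-first search from every vertex. This is only the textbook baseline and not one of the parameterized algorithms of Chapter 3. The adjacency-dictionary input format and the names `eccentricity` and `diameter` are assumptions made for this illustration.

```python
from collections import deque


def eccentricity(adj, source):
    """Largest distance from `source` to any vertex, via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) != len(adj):
        raise ValueError("graph is not connected")
    return max(dist.values())


def diameter(adj):
    """Length of a longest shortest path in the graph `adj`."""
    return max(eccentricity(adj, s) for s in adj)
```

For example, `diameter({"a": ["b"], "b": ["a", "c"], "c": ["b"]})` evaluates the path a, b, c and returns 2. With n vertices and m edges, this baseline performs n searches of O(n + m) time each.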


These problems are in some sense very similar as all of the problems deal 
with shortest paths in a given undirected graph but are also quite different from 
one another: On the one hand, DIAMETER is polynomial-time solvable while
LENGTH-BOUNDED CUT and k-DISJOINT SHORTEST PATHS are NP-hard. On 
the other hand, LENGTH-BOUNDED CUT is about removing (cutting) parts 
from the graph while DIAMETER and k-DISJOINT SHORTEST PATHS are more 
about routing (finding specific shortest paths in a graph). Just as the problems
are similar and yet different, so are the algorithms we develop for each of them.
The algorithms have in common that they are dynamic programs but they 
differ in which of our four guiding questions is hardest to answer for them. 
These respective questions therefore deserve additional consideration and these 
considerations will guide us through the first part of the thesis. In Chapter 3, 
we will study DIAMETER and we will encounter a dynamic program in which 
the dimension of the table is quite unique as it partially depends on the optimal 
solution and can therefore not be determined a priori. In Chapter 4, we study 
LENGTH-BOUNDED CUT. The dynamic program we develop there does not
allow us to look up the final answer in a specific table entry. Instead, the final
answer is computed by iterating over a few specific table entries. Finally, in 
Chapter 5 we study k-DISJOINT SHORTEST PATHS and develop a dynamic 
program for it. The question of how to compute each table entry seems very 
easy to answer at first glance but it will turn out that considering it some more 
and ignoring some information given to us allows for a much faster algorithm. 


1.2 2-SAT Programming 


2-SAT programming¹ refers to the procedure of efficiently reducing a problem to
a set of Boolean formulas in 2-CNF (conjunctive normal form with at most two
literals per clause) such that the solution for the original problem can be
constructed from the solutions for the 2-SAT formulas (satisfying truth assignments
or the fact that formulas are unsatisfiable). The technique has been used in a
wide range of contexts, e.g., subgraph detection [HL00, Jan17], graph
transformation [HHW03], matrix partitioning [Bul+16], computational biology [EHK03,
GW09], resource allocation [HX20, MB20], and cartography [WW95]. However,
we could not find many more examples of it being used and, to the best of our 
knowledge, 2-SAT programming has never been systematically analyzed as a 
general technique to solve computational problems. We start such an analysis by 
comparing how and when this technique was used in the literature so far. Before 
we do so, we first begin with an example of how to use 2-SAT programming. 

We show how to solve the following special case of INDEPENDENT SET in 
linear time in the input size. 


BOOLEAN MULTICOLORED INDEPENDENT SET
Input: An undirected graph G := (V, E) where each vertex has a color and
there are exactly two vertices of each color.
Question: Is there a colorful independent set in G, that is, is there a set
that contains exactly one vertex of each color and no two vertices in
this set share an edge in G?

¹ We mention that this method of problem solving has been used only a few times in the
literature before. The name 2-SAT programming is not established in the literature.

Let G = (V, E) be an instance of BOOLEAN MULTICOLORED INDEPENDENT
SET where $u_i$ and $v_i$ are the two vertices of the $i^{\text{th}}$ color in G. Our constructed
2-SAT program consists only of a single formula $\Phi$ which contains a variable $x_i$
for each color. Setting $x_i$ to true corresponds to picking $u_i$ into the solution and
setting $x_i$ to false corresponds to picking $v_i$ into the solution. The formula $\Phi$
consists of one clause for each edge $\{y, z\}$ in G that evaluates to false if both y
and z are picked into the solution. Let y have the $i^{\text{th}}$ color and let z have
the $j^{\text{th}}$ color and let $i \neq j$ without loss of generality. We distinguish between
the four possible cases (i) $y = u_i$ and $z = u_j$, (ii) $y = u_i$ and $z = v_j$, (iii) $y = v_i$
and $z = u_j$, and (iv) $y = v_i$ and $z = v_j$. In the first case, the clause shall
evaluate to false if and only if $x_i$ = true and $x_j$ = true. This is achieved by the
clause $\neg(x_i \wedge x_j) \equiv (\neg x_i \vee \neg x_j)$. Analogously, the clauses for the other three
cases are $(\neg x_i \vee x_j)$, $(x_i \vee \neg x_j)$, and $(x_i \vee x_j)$, respectively. An example of
this construction is given in Figure 1.1. Note that the 2-SAT formula $\Phi$ can be
computed in time linear in the size of G. Since $\Phi$ can be checked for a satisfying
truth assignment in linear time (in the length of $\Phi$ which is linear in the size
of G) [APT79], the total running time is linear in the input size. It is easy
to verify that $\Phi$ is satisfied by some truth assignment if and only if the set of
vertices corresponding to this truth assignment are pairwise non-adjacent in G.
Since each such set contains exactly one vertex of each color, each solution of $\Phi$
corresponds to a solution of BOOLEAN MULTICOLORED INDEPENDENT SET.

Figure 1.1: An example instance of BOOLEAN MULTICOLORED INDEPENDENT SET
and the 2-SAT formula $\Phi$ constructed by our 2-SAT program. The encircled vertices
form a solution and the edges are enumerated to allow easier verification of $\Phi$. The
first edge corresponds to the first clause in $\Phi$, the second edge to the second clause
in $\Phi$, and so on. Note that the encircled vertices correspond to the truth
assignment $x_1$ = true, $x_2$ = false, and $x_3$ = true. It is easy to verify that this is a satisfying
truth assignment to $\Phi$. The encircled solution indeed is the only solution for the given
instance and the described truth assignment is the only satisfying truth assignment
for $\Phi$.
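
The 2-SAT program for BOOLEAN MULTICOLORED INDEPENDENT SET can be sketched in Python as follows. The satisfiability check below uses the standard implication-graph approach with strongly connected components (computed here with Kosaraju's algorithm), which is one common linear-time method and is not necessarily the exact procedure of [APT79]. The names `solve_2sat` and `boolean_multicolored_independent_set` as well as the input encoding (one pair of vertices per color, edges as pairs of vertices) are assumptions made for this illustration.

```python
def solve_2sat(num_vars, clauses):
    """Each clause is a pair of literals (var, is_positive).
    Returns a satisfying list of booleans or None if unsatisfiable."""
    def node(lit):
        var, positive = lit
        return 2 * var + (0 if positive else 1)  # node ^ 1 is the negation

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for src, dst in ((node(a) ^ 1, node(b)), (node(b) ^ 1, node(a))):
            graph[src].append(dst)
            rev[dst].append(src)

    # Pass 1: iterative DFS computing a post-order of the implication graph.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            u, idx = stack.pop()
            if idx < len(graph[u]):
                stack.append((u, idx + 1))
                w = graph[u][idx]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(u)

    # Pass 2: components on the reversed graph, numbered in topological order.
    comp, c = [-1] * n, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack = [s]
        while stack:
            u = stack.pop()
            for w in rev[u]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1

    assignment = []
    for var in range(num_vars):
        if comp[2 * var] == comp[2 * var + 1]:
            return None  # x and (not x) imply each other
        assignment.append(comp[2 * var] > comp[2 * var + 1])
    return assignment


def boolean_multicolored_independent_set(pairs, edges):
    """pairs[i] = (u_i, v_i) are the two vertices of color i.
    Returns one vertex per color forming an independent set, or None."""
    literal_of = {}
    for i, (u, v) in enumerate(pairs):
        literal_of[u] = (i, True)   # x_i = true picks u_i
        literal_of[v] = (i, False)  # x_i = false picks v_i
    clauses = []
    for y, z in edges:
        (i, y_pos), (j, z_pos) = literal_of[y], literal_of[z]
        if i == j:
            continue  # only one vertex per color is picked anyway
        # Forbid picking both endpoints: (not lit_y) or (not lit_z).
        clauses.append(((i, not y_pos), (j, not z_pos)))
    assignment = solve_2sat(len(pairs), clauses)
    if assignment is None:
        return None
    return [u if assignment[i] else v for i, (u, v) in enumerate(pairs)]
```

Each edge contributes exactly one clause, so the formula length is linear in the size of the graph, and both passes of the component computation visit every implication a constant number of times.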
We continue this introduction to 2-SAT programming with an analysis of
how and when 2-SAT programming was used in the literature before and how
our two new results fit into this picture. The majority of results that we could
find that used 2-SAT programming ([Bul+16, EHK03, GW09, HHW03, HL00,
WW95]) used it as follows. Variables describe whether or not to pick some
element into a solution set and the clauses in each 2-SAT formula prevent
conflicting elements from being chosen in the same solution. In the
remaining examples ([HX20, Jan17, MB20]), variables do not represent whether
or not to pick some element into a solution but rather which element is picked
into a solution (as in our example above). By exploring 2-SAT programming
more in-depth in the thesis, we hope to find some indications of when 2-SAT
programming should be considered for new (algorithmic) problems. Indeed,


we will conclude that 2-SAT programming is promising when the considered
problem is (thought to be) polynomial-time solvable and has some independence
structure. By that, we mean that a solution consists of elements such that


• the elements can be partitioned into constant-size parts and at most one
element from each part is picked into the solution, and


• a set of elements forms a solution if each pair of elements in this set can
be contained in the same solution.


In the example above, the elements were the vertices of the graph, the partition 
was achieved by the colors of vertices, and a set of vertices forms a solution only 
if they are pairwise non-adjacent. 

We present two new examples of 2-SAT programming in the second part of 
the thesis. These examples follow the distinction of 2-SAT programs stated 
above. In Chapter 6, we study a problem from computational biology called 
TREE CONTAINMENT that asks whether a specific subtree exists in a given 
directed graph. Roughly speaking, our approach is to introduce a variable for 
each vertex in the input network that is set to true if the vertex belongs to the 
sought subtree and to false otherwise. In Chapter 7, we investigate REACHABLE 
OBJECT, a problem stemming from the field of resource allocation. Therein, 
agents initially own objects, have different preferences over the objects, and are 
arranged in a social network. They may swap objects with one another under 
certain conditions including that they must be adjacent in the social network. 
The question is then whether a specific agent can obtain a given target object. 
In a cycle, each object is given from the agent that initially holds it to one of 
its two possible neighbors. The variables in the 2-SAT program we develop in
Chapter 7 then represent, for each object, to which of the two respective agents
it is swapped.

We conclude this introduction to 2-SAT programming by describing a
similarity and a dissimilarity between the two 2-SAT programs we develop in the
thesis. They have in common that they are designed for very sparse graphs (trees
in Chapter 6 and cycles in Chapter 7), and they differ in how they generalize
to “k-SAT programs”. While the algorithm for TREE CONTAINMENT generalizes
naturally, for any constant k, to a correct algorithm for a meaningful
problem, the same cannot be said about the algorithm for REACHABLE OBJECT.


1.3 Results 


In this thesis, we design and analyze algorithms for (mostly NP-hard) graph 
problems. We achieve a wide range of different results, among others, parame- 
terized hardness and algorithms, polynomial-time algorithms for special cases, 
and results within FPT in P, that is, parameterized algorithms and hardness 
results for problems in P [GMN17]. We also resolve some open problems from 
the literature. 

In Chapter 3, we study the DIAMETER problem, which asks for the length of a
longest shortest path between any two vertices in a given graph. This parameter was
observed to be very small in many different real-world applications [LH08, Mil67,
New03] and it is often used in network analysis [AJB99, WF94]. This has led
to a wide spectrum of algorithms computing the diameter faster than the naive
algorithm (see Zwick [Zwi01]). We add to this spectrum by providing new
parameterized algorithms for computing the diameter. On the one hand, we
study distance-from-triviality-like parameters [GHN04] and show that graphs
with small modulators to cographs, that is, small sets of vertices whose removal
yields a cograph, allow for faster diameter computations while graphs with small
modulators to bipartite graphs do not. On the other hand, we study parameter
combinations that are expected to be small in real-world applications. Here, we
show that the combined parameter h-index plus diameter allows for positive
FPT-in-P results whereas, under standard complexity assumptions, similar
combinations do not. The algorithms for graphs with small modulators to cographs and
for the combined parameter h-index plus diameter are both based on dynamic
programming.

In Chapter 4, we study the problem LENGTH-BOUNDED CUT, which arises
from the field of network flows. Given an undirected graph, two terminal
vertices s and t, and two integers k and $\ell$, the question is whether there
is a set of at most k edges such that removing these edges yields a graph
in which the distance between s and t is larger than $\ell$. We prove a conjecture
by Bazgan et al. [Baz+19] by providing a polynomial-time algorithm for
LENGTH-BOUNDED CUT on proper interval graphs, which is based on dynamic
programming. We also briefly investigate interval graphs and show limitations
of our approach for proper interval graphs.
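
To make the problem statement concrete, the following Python sketch checks whether a given edge set is a valid solution. It only verifies a candidate and is not the dynamic program of Chapter 4; the function name and the convention that an unreachable t has infinite distance from s are assumptions made for illustration.

```python
from collections import deque


def is_length_bounded_cut(edges, s, t, removed, ell):
    """Check that deleting `removed` leaves s and t at distance > ell."""
    removed = {frozenset(e) for e in removed}
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if frozenset((u, v)) not in removed:
            adj[u].add(v)
            adj[v].add(u)
    # Breadth-first search from s in the remaining graph.
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    # An unreachable t is treated as having infinite distance (assumption).
    return t not in dist or dist[t] > ell
```

An instance with budget k is then a yes-instance if some edge set of size at most k passes this check.
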

In Chapter 5, we look at a long-standing open question regarding the complexity
of k-DISJOINT SHORTEST PATHS for constant k [Eil98, Fom+19]. In
k-DISJOINT SHORTEST PATHS, we are given an undirected graph and k terminal
pairs $(s_i, t_i)$, and the question is whether there are k disjoint paths $P_i$ such
that $P_i$ is a shortest path between $s_i$ and $t_i$. We present an algorithm whose
running time is polynomial in the input size for each constant k. The algorithm
is based on dynamic programming and a geometric representation of the problem
that is quite intuitive yet, to the best of our knowledge, novel.

In Chapter 6, we investigate a problem variant of TREE CONTAINMENT which 
stems from computational biology. Given a leaf-labeled directed acyclic graph N 
(called a phylogenetic network) and a leaf-labeled directed tree T, the question 
in TREE CONTAINMENT is whether N displays T. This is the case if N contains 
a subdivision of T as a subgraph that respects leaf-labels [ISS10]. A version 
of TREE CONTAINMENT where N is a tree is used in the quest for finding the 
“tree of life”, that is, given the current knowledge of speciation (modeled as a 
directed tree N) and some new data (modeled as another (possibly smaller) 
directed tree T), the question is whether N and T are consistent. We call 
the version we investigate SOFT TREE CONTAINMENT. It is motivated by soft 
polytomies, that is, multiple speciation events whose order is unknown. Another 
kind of uncertainty can be modeled by allowing N to have multiple leaves with 
the same label. Our main contribution is a dichotomy result regarding the
maximum number of occurrences of a label in N. On the one hand, using 2-SAT
programming, we show that SOFT TREE CONTAINMENT is polynomial-time
solvable if N is a tree in which each leaf-label occurs at most twice. On the
other hand, we show that SOFT TREE CONTAINMENT remains NP-hard when
restricted to trees in which each leaf-label occurs at most thrice.

In Chapter 7, we study a problem called REACHABLE OBJECT. Therein, 
one is given a set of agents, a set of objects, a specific agent I, and a specific
object x. Each agent has strict preferences over the objects and initially owns 
exactly one object. Additionally, the agents are arranged in a graph (social 
network) representing which agents know each other. The question is then 
whether there is a sequence of rational swaps such that agent I owns object x 
in the end [GLW17]. A rational swap is a trade between two agents that know 
each other such that both agents receive an object they prefer over the object 
they give away. Our contribution is twofold. First, we present a dichotomy 
result regarding the number of objects each agent prefers over its initially 
held object. If each agent prefers at most two objects over the one it initially
holds, then REACHABLE OBJECT can be solved in polynomial time using dynamic
programming. The problem remains NP-hard even if each agent prefers at most
three objects over its initially held object. Second, using 2-SAT programming, 
we provide a polynomial-time algorithm for REACHABLE OBJECT on cycles 
which is a generalization of a previous algorithm for REACHABLE OBJECT on
paths [HX20]. The original algorithm for paths answered an open problem from
the literature [GLW17]. 

Finally, we summarize our main results and give a broader overview of
possible avenues for further research regarding dynamic programming and 2-SAT
programming in Chapter 8.

Chapter 2


Preliminaries 


In this chapter, we describe our notation and some general tools that will be 
used in the following chapters. If a specific notion is only used in a single 
chapter, then it will be introduced there. We assume familiarity with the basics 
of set theory, calculus, and the description and analysis of algorithms. 


2.1 Number Theory 


We denote by $\mathbb{Z} := \{\ldots, -1, 0, 1, \ldots\}$ the set of all integers, by $\mathbb{N} := \{0, 1, 2, \ldots\}$
the set of all non-negative integers, and by $\mathbb{N}^+ := \mathbb{N} \setminus \{0\}$ the set of all positive
integers. We use $\mathbb{Q} := \{p/q \mid p \in \mathbb{Z} \wedge q \in \mathbb{N}^+\}$ to denote the set of all rational
numbers and $\mathbb{Q}^+ := \{p/q \mid p, q \in \mathbb{N}^+\}$ to denote the set of all positive rational
numbers. The set of all real numbers is denoted by $\mathbb{R}$.

For two integers $a, b \in \mathbb{Z}$, we denote by $[a,b]$ the integer interval between a
and b, that is, $[a,b] := \{i \in \mathbb{Z} \mid a \le i \le b\}$. Analogously, we denote the
rational interval between a and b by $[a,b]^{\mathbb{Q}} := \{i \in \mathbb{Q} \mid a \le i \le b\}$. For $a > b$,
let $[a,b] := [a,b]^{\mathbb{Q}} := \emptyset$. Finally, for a positive integer $\ell \in \mathbb{N}^+$, we use $[\ell]$ as an
abbreviation for $[1,\ell] = \{1, 2, \ldots, \ell\}$.


2.2 Graph Theory 


An undirected graph G is a tuple (V, E) where V is the set of vertices or nodes
and $E \subseteq \binom{V}{2}$ is the set of edges. A directed graph is a tuple (V, A) where V is
again the set of vertices or nodes and $A \subseteq \{(u,v) \mid u \neq v \wedge u, v \in V\}$ is the set of
arcs. We will use $n := |V|$ to denote the number of vertices, $m := |E|$ ($m := |A|$)
to denote the number of edges or arcs, and $|G| := n + m$ to denote the size of G.
All graphs in this thesis are undirected unless explicitly stated otherwise.

For a vertex subset $V' \subseteq V$, we denote by $G[V']$ the graph induced by $V'$,
that is, the graph $G[V'] := (V', E' := \{\{u,v\} \in E \mid u, v \in V'\})$ if G is an
undirected graph and $G[V'] := (V', A' := \{(u,v) \in A \mid u, v \in V'\})$ if G
is directed. We abbreviate $G - V' := G[V \setminus V']$. A path $P := (v_0, \ldots, v_\ell)$
in a directed graph is a graph with a set $V(P) := \{v_0, \ldots, v_\ell\}$ of vertices
and arc set $A(P) := \{(v_i, v_{i+1}) \mid 0 \le i < \ell\}$. A path $P := (v_0, \ldots, v_\ell)$ in an
undirected graph is a graph with a set $V(P) := \{v_0, \ldots, v_\ell\}$ of vertices and a
set $E(P) := \{\{v_i, v_{i+1}\} \mid 0 \le i < \ell\}$ of edges. We say that $\ell$ is the length of P
and a shortest path between two vertices is a path of minimum length. We
define $A(P)$ to be the set of arcs $\{(v_{i-1}, v_i) \mid i \in [\ell]\}$ and $A^{-1}(P)$ to be the
set of arcs $\{(v_i, v_{i-1}) \mid i \in [\ell]\}$. Intuitively, $A(P)$ and $A^{-1}(P)$ describe the two
directed versions of P in an undirected graph. The vertices $v_0$ and $v_\ell$ are called
the end vertices or ends of P and are denoted by start(P) and end(P), respectively. We also
say that P is a path from $v_0$ to $v_\ell$, a path between $v_0$ and $v_\ell$, or a $v_0$-$v_\ell$-path.
When no ambiguity arises, we do not distinguish between a path and its set of
vertices. We identify specific paths by just some of their vertices, e. g., we use
the name a-b-c-path to denote a path that starts in a, then continues by some
shortest a-b-path, and ends with some shortest b-c-path.

Let v, w be two vertices in a path P. We denote by $P[v,w]$ the subpath
of P with end vertices v and w. For two paths $P_1 := (v_0, \ldots, v_a)$
and $P_2 := (v'_0, \ldots, v'_b)$ with $v'_0 = v_a$ or $\{v_a, v'_0\} \in E$ ($(v_a, v'_0) \in A$), we define
$P_1 \circ P_2 := (v_0, \ldots, v_a, v'_1, \ldots, v'_b)$ or $P_1 \circ P_2 := (v_0, \ldots, v_a, v'_0, \ldots, v'_b)$,
respectively. For two vertices $u, v \in V$, we denote by $\operatorname{dist}_G(u,v)$ the distance
between u and v in G, that is, the number of edges in a shortest path between u
and v. If G is clear from the context, then we omit the subscript. A connected
component $C \subseteq V$ in a graph G is a maximal set of vertices such that there is a
path between each pair of vertices in C.
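
As an illustration of these two notions, the following Python sketch computes distances from a single vertex by breadth-first search and derives the connected components from it. The adjacency-dictionary representation and the function names are assumptions made for this example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Map every vertex reachable from `source` to its distance from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def connected_components(adj):
    """Return the connected components of an undirected graph as sets."""
    seen, components = set(), []
    for v in adj:
        if v not in seen:
            component = set(bfs_distances(adj, v))
            seen |= component
            components.append(component)
    return components
```

Each breadth-first search takes O(n + m) time on a graph given by adjacency lists.
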

The degree $\deg_G(v)$ of a vertex $v \in V$ in an undirected graph G is the number
of edges that contain v. The in-degree of a vertex $v \in V$ in a directed graph is
the number of arcs of the form (u, v) for $u \in V$. A vertex with in-degree zero
is called a source. The out-degree of a vertex $v \in V$ in a directed graph is the
number of arcs of the form (v, w) for $w \in V$. A vertex with out-degree zero
is called a sink. The degree of a vertex $v \in V$ in a directed graph is the sum
of its in-degree and its out-degree. The neighborhood $N_G(v)$ of a vertex v is the
set of all vertices that share an edge (or arc) with v in G and we use $N_G[v]$
to denote $N_G(v) \cup \{v\}$. Again, if G is clear from the context, then we omit the
subscript. Suppressing a degree-two vertex $v \in V$ in an undirected graph G
refers to the action of removing the vertex v from G and adding the edge
between v's two neighbors u, w if it is not already contained in G. Suppressing
a vertex $v \in V$ in a directed graph G = (V, A) with in-degree one and
out-degree one refers to the action of removing the vertex v from G and adding the
arc (u, w) where $(u,v), (v,w) \in A$. Again, if this arc was already present in A,
then we just remove v with its two incident arcs. Subdividing an edge {u, w}
in an undirected graph refers to the action of removing {u, w} and adding a
new vertex v and new edges {u, v} and {v, w}. Subdividing an arc (u, w) in a
directed graph refers to the action of removing the respective arc and adding a
new vertex v and new arcs (u, v) and (v, w).
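
The two operations for undirected graphs can be sketched in Python as follows. The representation (a dictionary mapping each vertex to the set of its neighbors) and the function names are assumptions made for illustration.

```python
def suppress(adj, v):
    """Suppress the degree-two vertex v of an undirected graph in place."""
    if len(adj[v]) != 2:
        raise ValueError("only degree-two vertices can be suppressed")
    u, w = adj.pop(v)  # the two neighbors of v
    adj[u].discard(v)
    adj[w].discard(v)
    # Using sets means the edge {u, w} is not duplicated if already present.
    adj[u].add(w)
    adj[w].add(u)


def subdivide(adj, u, w, new_vertex):
    """Subdivide the edge {u, w} by inserting `new_vertex` in place."""
    if w not in adj[u]:
        raise ValueError("the edge to subdivide does not exist")
    adj[u].remove(w)
    adj[w].remove(u)
    adj[new_vertex] = {u, w}
    adj[u].add(new_vertex)
    adj[w].add(new_vertex)
```

Subdividing an edge and then suppressing the new vertex restores the original graph.
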

We continue with some notation for directed acyclic graphs (DAGs). We call
a vertex d in a DAG G a descendant of another vertex a if there is an a-d-path
in G. Moreover, we call a an ancestor of d. For a DAG G, let $<_G$ be a
relation between vertices in G such that $v <_G u$ if and only if u is an ancestor
of v. Moreover, let $u \le_G v$ if and only if $u <_G v$ or $u = v$. We define $G_v$
to be the subgraph of G induced by $\{u \mid u \le_G v\}$. The set of least common
ancestors $\mathrm{LCA}_G(X)$ of a set X of vertices contains all minima with respect
to $<_G$ among all vertices u of G with $v \le_G u$ for all $v \in X$. In particular,
if G is a tree, then $\mathrm{LCA}_G(X)$ contains a single vertex. If G is clear from the
context, then we may drop the subscript.

Two undirected graphs $G := (V_G, E_G)$ and $H := (V_H, E_H)$ are isomorphic if
there is a bijection f between $V_G$ and $V_H$ such that for any two vertices $u, v \in V_G$
it holds that $\{u,v\} \in E_G$ if and only if $\{f(u), f(v)\} \in E_H$. Analogously, two
directed graphs $G := (V_G, A_G)$ and $H := (V_H, A_H)$ are isomorphic if there is
a bijection f between $V_G$ and $V_H$ such that for any two vertices $u, v \in V_G$ it
holds that $(u,v) \in A_G$ if and only if $(f(u), f(v)) \in A_H$. We call f the mapping
between G and H.


2.2.1 Graph Classes 


A tree is a connected acyclic (directed or undirected) graph, that is, a graph in
which each pair of vertices is connected by exactly one path. A rooted
tree is a tree T with a designated vertex r called the root of T. The depth of
a vertex v in a rooted tree is the distance between v and r. The height of a
vertex v in a rooted tree is the maximum distance between v and a leaf $\ell$ in T
such that v is contained in a shortest r-$\ell$-path. A forest is a graph in which
each connected component is a tree.

Figure 2.1: An example of an interval graph (left side) and its interval
representation (right side).


Figure 2.2: An example of a generalized caterpillar with hair length two. The topmost 
vertices form the central path and the paths below are the hairs. 


A clique is a graph G = (V, E) with $E = \{\{u,v\} \mid u, v \in V, u \neq v\}$. A graph is
bipartite if its vertex set can be partitioned into two sets $V_1, V_2$ such that for
each edge $\{u,v\} \in E$ it holds that $u \in V_1$ and $v \in V_2$ (or $u \in V_2$ and $v \in V_1$).
Analogously, a graph is k-partite if its vertex set can be partitioned into k
sets $V_1, V_2, \ldots, V_k$ such that it holds for each edge $\{u,v\} \in E$ that u and v are
not contained in the same vertex set $V_i$.


An interval graph is a graph G = (V, E) such that each vertex v can be
represented by a rational interval $[b_v, f_v]^{\mathbb{Q}}$ such that two vertices u, w are adjacent
in G if and only if $[b_u, f_u]^{\mathbb{Q}} \cap [b_w, f_w]^{\mathbb{Q}} \neq \emptyset$. A proper interval graph is an interval
graph with an interval representation in which there are no two vertices v and w
such that $[b_v, f_v]^{\mathbb{Q}} \subseteq [b_w, f_w]^{\mathbb{Q}}$.
Equivalently, a proper interval graph can be defined as an interval graph where
each interval has length one, i. e., $b_v + 1 = f_v$ for each vertex v (see e. g. [BLS99]).
An example of an interval graph and its interval representation is given in
Figure 2.1. A cograph is a graph that does not contain a $P_4$ (a path of four
vertices and three edges) as an induced subgraph. A caterpillar is a tree such
that removing all leaves yields a path (i. e., all vertices are within distance at
most one of a “central path”). A generalized caterpillar with hairs of length at
most $h \ge 1$ is a tree such that removing paths of length at most h yields a path.
A generalized caterpillar with hairs of length two is shown in Figure 2.2.
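
The definition of interval graphs translates directly into code. The following Python sketch builds the edge set from a given interval representation and tests whether a representation is proper in the sense stated above (no interval is contained in another one, with non-strict containment). The function names and the dictionary-based input format are assumptions made for illustration.

```python
from itertools import combinations


def interval_graph_edges(intervals):
    """intervals maps each vertex v to its closed interval (b_v, f_v)."""
    edges = set()
    for u, w in combinations(intervals, 2):
        (bu, fu), (bw, fw) = intervals[u], intervals[w]
        # Two closed intervals intersect iff the larger left end point
        # is not to the right of the smaller right end point.
        if max(bu, bw) <= min(fu, fw):
            edges.add(frozenset((u, w)))
    return edges


def is_proper_representation(intervals):
    """Check that no interval is contained in another one."""
    for u, w in combinations(intervals, 2):
        (bu, fu), (bw, fw) = intervals[u], intervals[w]
        if (bw <= bu and fu <= fw) or (bu <= bw and fw <= fu):
            return False
    return True
```

Comparing all pairs takes quadratic time, which suffices for small examples; sorting the end points would be the usual way to speed this up.
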
2.2.2 Graph Parameters


The maximum degree of a graph G = (V, E) is the maximum number of
edges incident to any single vertex in the graph, that is, $\max\{\deg(v) \mid v \in V\}$.
Analogously, the minimum degree is defined as $\min\{\deg(v) \mid v \in V\}$ and the
average degree of a graph is $2m/n$. We denote by d(G) the diameter of G, that
is, the length of a longest shortest path in G. The h-index of a graph G is the
maximum number h such that the graph contains at least h vertices of degree
at least h.
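
Both the h-index and the diameter can be computed directly from their definitions. The following Python sketch does so for a graph given as a dictionary from vertices to neighbor sets; the diameter routine is the naive approach that runs one breadth-first search per vertex and assumes a connected graph. The function names are assumptions made for illustration.

```python
from collections import deque


def h_index(adj):
    """Largest h such that at least h vertices have degree at least h."""
    degrees = sorted((len(nbrs) for nbrs in adj.values()), reverse=True)
    h = 0
    for i, degree in enumerate(degrees, start=1):
        if degree >= i:
            h = i
    return h


def diameter(adj):
    """Naive diameter computation: one BFS from every vertex."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) != len(adj):
            raise ValueError("graph is not connected")
        best = max(best, max(dist.values()))
    return best
```

Since each breadth-first search takes O(n + m) time, the diameter routine takes O(n · m) time on connected graphs.
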

One of the most famous graph parameters is the treewidth. It is defined
through tree decompositions. A tree decomposition of a graph G = (V, E) is a
tree $T = (\mathcal{X}, E')$, where each $X_i \in \mathcal{X} = \{X_1, X_2, \ldots, X_\ell\}$ is a subset of V and
the following three properties hold. First, each vertex $v \in V$ is contained in at
least one $X_i \in \mathcal{X}$. Second, for each vertex $v \in V$, the set of all $X_i$ with $v \in X_i$
induces a connected subgraph in T. Third, for every edge $\{u,v\} \in E$, there is
a subset $X_i$ that contains both u and v. The width of a tree decomposition
is $\max\{|X| \mid X \in \mathcal{X}\} - 1$ and the treewidth of G is the minimum width among all
possible tree decompositions of G. The pathwidth of a graph is defined similarly
to its treewidth, but instead of tree decompositions only path decompositions
are considered, that is, the tree T is required to be a path.
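
The three properties can be checked mechanically. The following Python sketch verifies a candidate tree decomposition and computes its width; it assumes that the given bag adjacency indeed forms a tree and does not check this. The function names and input format are assumptions made for illustration.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """bags maps a bag id to a vertex set; tree_edges joins bag ids."""
    # First property: every vertex occurs in at least one bag.
    if any(all(v not in bag for bag in bags.values()) for v in vertices):
        return False
    # Third property: every edge is contained in some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Second property: the bags containing v are connected in the tree.
    tree_adj = {i: set() for i in bags}
    for i, j in tree_edges:
        tree_adj[i].add(j)
        tree_adj[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in tree_adj[i] & nodes:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: maximum bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1
```

For a path on four vertices, the bags {1, 2}, {2, 3}, {3, 4} arranged along a path form a tree decomposition of width one.
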

The girth of a graph is the size of a smallest induced cycle in the graph (or $\infty$
if the graph is a forest). The bisection width of a graph G = (V, E) is defined as
the size of a smallest set $E'$ of edges such that V can be partitioned into two
sets $V_1, V_2$ with $|V_1| = |V_2|$ (or $|V_1| = |V_2| + 1$ if $|V|$ is odd) such that all edges
with one end in $V_1$ and one end in $V_2$ are contained in $E'$. Bisection width is
illustrated in Figure 2.3. A dominating set in a graph is a set K of vertices such
that each vertex in the graph is contained in K or has at least one neighbor
in K. The domination number of a graph is the size of a minimum dominating
set in it. The acyclic chromatic number of a graph is the minimum number of
colors needed to color each vertex with one of the given colors such that each
subgraph induced by all vertices of one color is an independent set and each
subgraph induced by all vertices of two colors is acyclic.
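
As a small illustration of the domination number, the following Python sketch checks whether a vertex set is dominating and determines the domination number by exhaustive search. The search takes exponential time and is meant for tiny examples only; the function names and the neighbor-set representation are assumptions made for illustration.

```python
from itertools import combinations


def is_dominating_set(adj, candidate):
    """Every vertex is in the candidate set or has a neighbor in it."""
    candidate = set(candidate)
    return all(v in candidate or adj[v] & candidate for v in adj)


def domination_number(adj):
    """Size of a minimum dominating set, found by brute force."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_dominating_set(adj, subset):
                return size
```

Since the whole vertex set is always dominating, the search terminates with a value of at most n.
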

Lastly, for some graph class $\Pi$, the distance to $\Pi$ is the size of a minimum
set of vertices such that the graph resulting from deleting this set of vertices is
in $\Pi$. In this thesis, we will consider the distance to cographs, the distance to
bipartite graphs, and the distance to forests. The distance to bipartite graphs is
known as odd cycle transversal number and the distance to forests is known as
feedback vertex number in the literature. We will hence use these names. The
edge-deletion distance to forests, that is, the size of a smallest set of edges such
that removing them yields a forest, is known as the feedback edge number.


Figure 2.3: An example of the bisection width of a graph. The edges between the
two parts are drawn using dashed lines and the bisection width is two (the number of
dashed edges).


2.3 Complexity Classes and Hypotheses 


We assume familiarity with the basics of Turing machines and Random Access 
Machines. Otherwise, we refer to Papadimitriou [Pap94]. In this thesis, we will 
always analyze the running time of an algorithm in terms of Random Access 
Machines. However, complexity classes are classically defined using Turing 
machines. 

The class P contains all decision problems (or languages) that can be decided 
in polynomial time by deterministic Turing machines. The class NP contains all 
decision problems that can be decided in polynomial time by non-deterministic 
Turing machines. 

A parameterization for a problem L is formally a pair of functions (f, g) such
that f maps each possible input I for L to some object f(I) and g maps each
such object to a non-negative integer. We use the treewidth of a graph as an
example. Here, f maps each graph G to a tree decomposition of G of minimum
width and g measures the width of the tree decomposition, that is, the maximum
number of vertices in any bag of the tree decomposition (minus one). A parameter
is then the resulting non-negative integer g(f(I)) of a parameterization. A
parameterized problem is a tuple $(L, \kappa)$, where L is a language (an unparameterized
decision problem) and $\kappa$ is a parameter. An instance of $(L, \kappa)$ is a pair (x, k)
where k = g(f(x)) for some parameterization (f, g). For a parameterized
problem $\mathcal{L} = (L, \kappa)$, the language $\hat{L} = \{x \in \Sigma^* \mid \exists k \colon (x,k) \in \mathcal{L}\}$ is called the
unparameterized problem associated to $\mathcal{L}$. For a broader introduction into
parameterized complexity theory, we refer the reader to the books by Cygan et
al. [Cyg+15], Downey and Fellows [DF13], Flum and Grohe [FG06], and
Niedermeier [Nie06].

A problem L is fixed-parameter tractable with respect to some parameter $\kappa$ if
there is an algorithm deciding whether $(x,k) \in (L,\kappa)$ (or equivalently $x \in L$)
in $f(k) \cdot |x|^{O(1)}$ time, where $|x|$ denotes the size of x and f is some computable
function depending only on k. The class FPT contains all parameterized
problems $(L,\kappa)$ where L is fixed-parameter tractable with respect to $\kappa$. The
class XP contains all parameterized problems $(L,\kappa)$ such that there is an
algorithm deciding whether $(x,k) \in (L,\kappa)$ in $|x|^{f(k)}$ time, where f is again
some computable function only depending on k. The class W[1] contains all
parameterized problems $(L,\kappa)$, where every instance (x, k) of $(L,\kappa)$ can be
transformed in $f(k) \cdot |x|^{O(1)}$ time to a combinatorial circuit that has weft at
most one and constant depth for all instances, such that $(x,k) \in (L,\kappa)$ if and
only if there is a satisfying truth assignment to the inputs of the circuit that assigns true
to exactly k inputs. The weft of a combinatorial circuit is the largest number of
logical units with unbounded fan-in on any path from an input to the output.
The depth of a combinatorial circuit is the largest number of logical units on any
path from an input to the output. Similarly to the assumption that P $\neq$ NP, the
assumption FPT $\neq$ W[1] is widely believed and is used to exclude FPT-results.
A few years ago, the topic of FPT in P [GMN17] emerged from parameterized
complexity theory. Therein, instead of designing $f(k) \cdot |x|^{O(1)}$-time algorithms for
NP-hard problems where f is some superpolynomial function, one is interested
in $f(k) \cdot |x|^{c}$-time algorithms for problems in P, where no $O(|x|^{c})$-time algorithm
is known for the unparameterized problem associated to it.

The problem k-SAT is a generalization of 2-SAT and defined as follows. 


k-SAT 

Input: A Boolean formula $\Phi$ in conjunctive normal form where each clause
in $\Phi$ contains at most k literals.

Question: Is $\Phi$ satisfiable?


Analogously to 2-SAT programs, we use the term k-SAT program to refer to
an algorithm that solves a problem by constructing and solving k-SAT instances
(formulas in conjunctive normal form where each clause contains at most k literals).

The Exponential-Time Hypothesis (ETH) of Impagliazzo and Paturi [IP01]
postulates that there is no $2^{o(m)}$-time algorithm solving the SATISFIABILITY
problem, where m is the number of clauses. It is formalized as follows.
Hypothesis 2.1 (Exponential-Time Hypothesis (ETH)). There is some constant
$\delta > 0$ such that 3-SAT cannot be solved in $O(2^{\delta n})$ time, where n is the
number of variables in the input formula.


It is worth noting that assuming the ETH, there is no $f(k) \cdot n^{o(k)}$-time
algorithm solving the MULTICOLORED CLIQUE problem [Che+05], where f is a
computable function and k is the solution size.


MULTICOLORED CLIQUE 

Input: An integer k and a k-partite undirected graph G := (V, E) with
$V = \biguplus_{i=1}^{k} V_i$ and $|V_i| = n/k$ for all $i \in [k]$.

Question: Is there an induced clique of size at least k in G? 


A stronger version of the ETH is the so-called Strong Exponential-Time 
Hypothesis (SETH) [IP01]. It states the following. 


Hypothesis 2.2 (Strong Exponential-Time Hypothesis (SETH)). For each $\delta < 1$
there is an integer k such that k-SAT cannot be solved in $O(2^{\delta n})$ time.


Let ® be a Boolean input formula for SATISFIABILITY. We remark that 
if the SETH is true, then there is no |®|?~© - |}6|°“)-time algorithm solving 
SATISFIABILITY [IP01]. 


2.4 Reductions Between Problems 


A (many-one) reduction is a function $R \colon \Sigma^* \to \Sigma^*$ that transforms an
instance x of some problem L to an equivalent instance y of a problem $L'$, that
is, $y \in L' \iff x \in L$. A polynomial-time (many-one) reduction is a reduction
that can be computed in time polynomial in the input size $|x|$. To show that a
problem A is presumably not in P, one can reduce an NP-hard problem B to A
in polynomial time (written as $B \le^{P} A$). Unless P = NP, this shows that $A \notin$ P. A problem B
is NP-hard if for all problems in NP there is a polynomial-time reduction
to B. Famous examples of NP-hard problems are, e. g., MULTICOLORED CLIQUE
and k-SAT for each $k \ge 3$ [Kar75]. If $B \le^{P} A$ for an NP-hard problem B, then A is also
NP-hard.

A parameterized reduction is a reduction $R \colon \Sigma^* \times \mathbb{N} \to \Sigma^* \times \mathbb{N}$ that transforms
a parameterized problem $\mathcal{L}$ to a parameterized problem $\mathcal{L}'$ in FPT-time, that
is, for each instance (x, k) of $\mathcal{L}$ it produces an instance $(y, \ell)$ of $\mathcal{L}'$ such that


1. $(y, \ell)$ can be computed in $f(k) \cdot |x|^{O(1)}$ time for some computable function f,
2. $(y, \ell) \in \mathcal{L}' \iff (x, k) \in \mathcal{L}$, and
3. $\ell \le g(k)$ for some computable function g.


To show that some parameterized problem is presumably not in FPT, one
regularly uses the standard complexity assumption that FPT $\neq$ W[1] and shows
that a problem is W[1]-hard. To show W[1]-hardness for some parameterized
problem $\mathcal{L}$, we use parameterized reductions from W[1]-hard problems similar to
the unparameterized setting. Probably the most famous example of a W[1]-hard
problem is MULTICOLORED CLIQUE parameterized by the solution size k.

Concerning FPT-in-P studies, we use the notion of General-Problem-hardness,
which formalizes the types of reduction that allow us to exclude certain
parameterized algorithms for problems in P. In a nutshell, we want to upper-bound the
parameter in the constructed instance by some constant $\ell$ without increasing
the running time or the instance size by too much. Since it holds for each
computable function f that $f(\ell)$ is some constant, we can then hide any dependency
on $\ell$ in the Landau notation.


Definition 2.1 ([Ben+19b, Definition 3.1]). Let $\mathcal{L} \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized
problem, let $\hat{L} \subseteq \Sigma^*$ be the unparameterized decision problem associated to $\mathcal{L}$,
and let $g \colon \mathbb{N} \to \mathbb{N}$ be a polynomial. We call $\mathcal{L}$ $\ell$-General-Problem-hard(g)
($\ell$-GP-hard(g)) if there exists an algorithm A transforming any input instance x
of $\hat{L}$ into a new instance (y, k) of $\mathcal{L}$ such that


1. A runs in $O(g(|x|))$ time,
2. $(y, k) \in \mathcal{L} \iff x \in \hat{L}$,
3. $k \le \ell$, and
4. $|y| \in O(|x|)$.


We call $\mathcal{L}$ General-Problem-hard(g) (GP-hard(g)) if there exists an integer $\ell$
such that $\mathcal{L}$ is $\ell$-GP-hard(g). We omit the running time and call $\mathcal{L}$
$\ell$-General-Problem-hard ($\ell$-GP-hard) if g is a linear function.


Showing GP-hardness for some parameter $\kappa$ allows us to lift algorithms for the
parameterized problem to the unparameterized setting as stated next. The idea
behind this statement is that if a parameterized problem is both
$\ell$-GP-hard($n^c$) and can be solved in $O(n^c \cdot f(k))$ time for some computable function f
and some constant c, then we can solve the unparameterized problem associated
to it in $O(n^c)$ time by first reducing an arbitrary input instance to an equivalent
instance in which the parameter is at most $\ell$ and then using the parameterized
algorithm, where we can hide the dependency on the parameter in the Landau
notation.


Lemma 2.3 ([Ben+19b, Lemma 3.2]). Let $g \colon \mathbb{N} \to \mathbb{N}$ be a polynomial, and
let $\mathcal{L} \subseteq \Sigma^* \times \mathbb{N}$ be a parameterized problem which is GP-hard(g). Let $\hat{L} \subseteq \Sigma^*$ be
the unparameterized decision problem associated to $\mathcal{L}$. If there is an algorithm
solving each instance (y, k) of $\mathcal{L}$ in $f(k) \cdot g(|y|)$ time, then there is an algorithm
solving each instance x of $\hat{L}$ in $O(g(|x|))$ time.


We conclude this chapter with a simple example to illustrate Lemma 2.3. 
Consider the problem of detecting whether a given undirected graph contains 
a clique of size five and the parameter bisection width. By simply copying the 
graph such that the resulting graph has twice as many vertices and edges, the 
bisection width becomes zero as there is no edge between the two copies of the 
original graph and both copies contain the same number of vertices. Moreover, 
the resulting graph contains a clique of size five if and only if the original graph 
contains a clique of size five. Now assume that there was an $O((n+m) \cdot f(k))$-time
algorithm for this problem where k is the bisection width and f is some
computable function. Then, this would imply that we could first construct an
equivalent instance in linear time where k = 0 as described above, and then
solve this equivalent instance in $f(k) \cdot (n+m) \in O(n+m)$ time.
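
The copying step of this example can be sketched in a few lines of Python. The graph is assumed to be given as a dictionary from vertices to neighbor sets, and the function name is hypothetical.

```python
def two_disjoint_copies(adj):
    """Return the disjoint union of two copies of the graph `adj`.

    Vertex v of the original graph becomes (v, 0) in the first copy and
    (v, 1) in the second copy; no edge joins the two copies.
    """
    doubled = {}
    for copy in (0, 1):
        for v, neighbors in adj.items():
            doubled[(v, copy)] = {(w, copy) for w in neighbors}
    return doubled
```

Splitting the vertices by their copy index yields two equally sized parts with no edges between them, so the bisection width of the resulting graph is zero, and the construction takes time linear in n + m.
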
Part I


Dynamic Programming 



Chapter 3 


Diameter 


In this chapter, we study the problem DIAMETER which asks for the maximum 
distance between any two vertices in a given undirected graph. Regarding 
dynamic programming, this chapter features a dynamic program that is note- 
worthy as the solution for DIAMETER will not be related to any table entry but 
to the final size of the table. To the best of our knowledge, this is the first time 
the dimension of a dynamic program was used in this way. 

Concerning the DIAMETER problem, many consider the diameter of a graph 
among the most fundamental graph parameters [Bac+18, New03, WF94]. Most 
known algorithms for determining the diameter first compute the shortest path 
between each pair of vertices (ALL-PAIRS SHORTEST PATHS) and then return 
the maximum [AVW16]. However, several more efficient algorithms have been 
proposed for special cases [AVW16, BHM20, Cor+01, FP80, Gaw+21] or for 
approximating the diameter [Ain+99, Bac+18, RW13, WY16]. 

In this chapter, we follow the FPT-in-P approach [AVW16, Ben+20, GMN17], 
that is, we propose parameterized algorithms for DIAMETER that run faster than 
known unparameterized algorithms when specific parameters are very small or 
show that such algorithms refute popular complexity assumptions. In Section 3.2, 
we follow the distance-from-triviality-parameterization paradigm [GHNO4] aiming 
to augment a folklore algorithm for DIAMETER on cographs such that it also 
works for graphs with small modulators to cographs, that is, graphs with small 
sets of vertices whose removal yields a cograph. We also analyze graphs with 
small modulators to bipartite graphs. For the parameter distance k to cographs, 
we provide a 2°") (n+m)-time algorithm. For the parameter odd cycle transversal 
number k (the distance to bipartite graphs), we use our recently introduced notion 
of General-Problem-hardness [Ben+19b] to show that DIAMETER parameterized 
by k is “as hard” as the unparameterized DIAMETER problem. In Section 3.3, 
we investigate parameter combinations that are motivated by properties of
social networks. Social networks often have special characteristics, including the 
small-world property (small diameter) and a power-law degree distribution (small 
average degree and small h-index) [LH08, Mil67, New03, New10, NP03]. Since 
social networks often have small diameter and small h-index, we investigate 
combinations of parameters closely related to the diameter and parameters 
closely related to the h-index. 

The domination number d is a parameter that upper-bounds the diameter and
the acyclic chromatic number a upper-bounds the average degree and is
upper-bounded by the h-index. Hence, the standard $O(n \cdot m)$-time algorithm runs
in $O(n^{2} \cdot a)$ time. We will show that this is essentially the best one can hope for
as, assuming the SETH, we can exclude $f(a, d) \cdot (n + m)^{2-\varepsilon}$-time algorithms for
each $\varepsilon > 0$. Our result is based on a reduction by Roditty and Williams [RW13]
which is modified such that the acyclic chromatic number and the domination
number in the resulting graph are five and four, respectively. It is known that
a $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithm where k is the combined parameter diameter
plus maximum degree would refute the SETH [BN19]. Complementing this lower
bound, we provide an $f(k) \cdot (n + m)$-time algorithm where k is the combined
parameter diameter plus h-index. The maximum degree upper-bounds the h-index.


3.1 Problem Definition and Related Work 


DIAMETER asks for the maximum distance between any two vertices in a given 
undirected and connected input graph. It is formally defined as follows and 
an example is given in Figure 3.1. Recall that $\operatorname{dist}_G(v, w)$ is the length of a
shortest path between v and w in G.


DIAMETER 

Input: An undirected and connected graph G := (V, E). 

Task: Compute the length of a longest shortest path in G, that
is, $\max\{\operatorname{dist}_G(u, v) \mid u, v \in V\}$.


Due to its importance, DIAMETER is extensively studied. Concerning worst-case
analysis, the theoretically fastest algorithms (in terms of dependence on the
number n of vertices) are based on matrix multiplication and run in $O(n^{2.373})$
time [Sei95]. In terms of the dependence on the input size n + m, the currently
fastest algorithms for ALL-PAIRS SHORTEST PATHS run in $O(n^{3} / 2^{\Omega(\sqrt{\log n})})$ time
in dense graphs [CW21] and in $O(nm)$ time in sparse graphs, respectively.
Figure 3.1: An undirected connected graph $G = (\{s, t, u, v, w\}, E)$. The diameter of G
is 3 as $\operatorname{dist}_G(u, v) = 3$ and $\operatorname{dist}_G(x, y) \le 3$ for all $x, y \in \{s, t, u, v, w\}$.


The O(nm)-time algorithm performs a breadth-first search from each vertex 
and algorithms for DIAMETER employed in practice are usually based on this ap- 
proach. See e. g. Borassi et al. [Bor+15] for a recent example of such an algorithm 
which also yields good performance bounds using average-case analysis [BCT17]. 

Concerning special graph classes, Gawrychowski et al. [Gaw+21] showed
how to solve DIAMETER on planar graphs in $\widetilde{O}(n^{5/3})$ time. Other special
cases include linear-time algorithms for outerplanar graphs [FP80] and chordal
graphs [Cor+01].

In this chapter, we follow the line of FPT in P [GMN17]. Starting FPT in P
for DIAMETER, Abboud et al. [AVW16] observed that, unless the SETH fails,
there is no $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithm for DIAMETER for any $\varepsilon > 0$ if k is
the treewidth of the graph. Their corresponding reduction also shows the same
hardness result for the combined parameter h-index plus domination number and
the parameter vertex cover number. Moreover, the reduction also implies that
the SETH is refuted by any $f(k) \cdot (n + m)^{2-\varepsilon}$-time algorithm for DIAMETER for
any computable function f and any $\varepsilon > 0$ when k is the distance to chordal
graphs. Evald and Dahlgaard [ED16] adapted the reduction to prove the same
for the parameter maximum degree.

Complementing the lower bound for the parameter treewidth by Abboud et al.
[AVW16], Bringmann et al. [BHM20] showed that DIAMETER can be solved
in $2^{O(k)} \cdot n^{1+o(1)}$ time where k is the treewidth of the graph. In the paper on
which this chapter is based, we systematically explored the parameter space
looking for parameters that allow for $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithms [BN19].
Figure 3.2 gives an overview of the parameterized results for DIAMETER,
and we present some selected results in this chapter.


[Figure 3.2: diagram of the parameter hierarchy, not reproduced. Node labels
recoverable from the diagram include: Max. Degree + Dom. No.; h-index + Dom. No.;
Max. Degree + Diameter; Bisection Width; Vertex Cover Number; Feedback Edge
Number; Ac. Chrom. No. + Dom. No.; h-index + Diameter; Maximum Degree; Distance
to Clique; Treewidth; Ac. Chrom. No. + Diameter; h-index; Distance to Interval;
Domination Number; Distance to Cograph; Distance to Chordal; Acyclic Chromatic
No.; Max Diameter of Components; Distance to Bipartite; Girth; Distance to
Perfect; Chromatic Number; Minimum Degree.]

Figure 3.2: Overview of the relation between the structural parameters and the
respective results for DIAMETER. An edge from a parameter $\alpha$ to a parameter $\beta$ below
$\alpha$ means that $\beta$ can be upper-bounded by a polynomial (usually linear) function
in $\alpha$ (see also the work by Schröder [Sch19]). The three small boxes below each
parameter indicate whether there exists (from left to right) an algorithm running
in $f(k) \cdot n^{2}$, $f(k) \cdot (n + m)^{1+o(1)}$, or $k^{O(1)} \cdot (n + m)^{1+o(1)}$ time, respectively. If a small box
is green (lighter), then a corresponding algorithm exists and the box to the left is
also green. Similarly, a red (darker) box indicates that a corresponding algorithm
would be a breakthrough. More precisely, if a middle box (right box) is red, then an
algorithm running in $f(k) \cdot (n + m)^{2-\varepsilon}$ (or $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$) time refutes the SETH.
If a left box is red, then an algorithm with running time $f(k) \cdot n^{2}$ implies an $O(n^{2})$-time
algorithm for DIAMETER in general. Hardness results for a parameter $\alpha$ imply the same
hardness results for the parameters below $\alpha$. Similarly, algorithms for a parameter $\beta$
imply algorithms for the parameters above $\beta$. White boxes indicate open problems.


The following results on approximating DIAMETER are known. A simple
breadth-first search yields a linear-time 2-approximation. Aingworth et al. [Ain+99]
improved the approximation factor to 3/2 at the expense of the higher running
time of $O(n^{2} \log n + m \sqrt{n \log n})$. Roditty and Williams [RW13] showed that
approximating DIAMETER within a factor of $3/2 - \delta$ in $O(n^{2-\varepsilon})$ time for any $\delta, \varepsilon > 0$
refutes the SETH. Moreover, for any $\varepsilon, \delta > 0$, a $(3/2 - \delta)$-approximation in
$O(m^{2-\varepsilon})$ time or a $(5/3 - \delta)$-approximation in $O(m^{3/2-\varepsilon})$ time also refutes the
SETH [Bac+18, CGR16]. On planar graphs, there is an approximation scheme
with near-linear running time [WY16].
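
The linear-time 2-approximation by a single breadth-first search follows from the triangle inequality: for any start vertex s with eccentricity ecc(s), the diameter lies between ecc(s) and 2 · ecc(s). The following is a minimal Python sketch of this bound, assuming a connected graph given as a dictionary of adjacency lists; the function names are illustrative and not taken from the cited works.

```python
from collections import deque

def eccentricity(adj, s):
    """Largest breadth-first-search distance from s (graph assumed connected)."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())

def diameter_bounds(adj, s):
    """Return (lower, upper) with lower <= diameter <= upper = 2 * lower."""
    ecc = eccentricity(adj, s)
    return ecc, 2 * ecc

# Path on five vertices; starting in the middle gives the weakest lower bound.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
lower, upper = diameter_bounds(path, 2)
```

On this path the true diameter is 4, so the upper bound is attained when the search starts at the middle vertex.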


3.2 Parameters Motivated by Graph Classes 


In this section, we investigate parameterizations that measure the distance to 
special graph classes. We study the odd cycle transversal number (the distance 
to bipartite graphs) and the distance to cographs. Roditty and Williams [RW13] 
state that when computing the diameter of a graph, then distinguishing between 
diameter two and diameter three is among the most difficult cases. Nonetheless, 
detecting cographs (a subclass of graphs with diameter two) is easier than com- 
puting the diameter. Moreover, if the graph contains only few pairs of vertices 
of distance at least three, then the distance to cographs is often small. Thus, 
an efficient algorithm for DIAMETER parameterized by the distance to cographs
might help dealing with the hard case of DIAMETER stated above. Besides
cographs, we study bipartite graphs as these are among the most fundamental
graph classes. Note that the lower bound of Abboud et al. [AVW16] for the
parameter vertex cover number (distance to edgeless graphs) already implies
that there is no $k^{O(1)} \cdot (n + m)^{2-\varepsilon}$-time algorithm for k being either of the two
considered parameters as both are upper-bounded by the vertex cover number
(see Figure 3.2). 


Odd Cycle Transversal. We will show that, assuming the SETH, there is
no $f(k) \cdot (n + m)^{2-\varepsilon}$-time algorithm for the odd cycle transversal number k for any
computable function f and any $\varepsilon > 0$. We do so by showing that DIAMETER is 4-GP-hard¹ with
respect to the combined parameter odd cycle transversal number plus girth. Recall
that the girth of a graph is the size of a smallest induced cycle in the graph and
that DIAMETER is 4-GP-hard with respect to the parameter odd cycle transversal
number plus girth if the following holds. Each instance (G, k) of the decision
version of DIAMETER can be transformed in O(|G|) time into an instance (G', k')
such that the odd cycle transversal number plus girth of G' is at most four and G'
has diameter k' if and only if G has diameter k. Recall further that if DIAMETER
is 4-GP-hard with respect to some parameter $\ell$, then Lemma 2.3 states the
following. If there is an algorithm solving each instance $(G, k, \ell)$ of the decision
version of DIAMETER in $O(f(\ell) \cdot g(|G|))$ time for any computable functions f
and g, then there is an algorithm that solves each instance (G', k') of the
unparameterized decision version of DIAMETER in $O(g(|G'|))$ time. This then
yields the following two results. First, any $f(k) \cdot n^{2.3}$-time algorithm can be
transformed into an $O(n^{2.3})$-time algorithm for DIAMETER (which is faster than
any known unparameterized algorithm). Second, any $f(k) \cdot (n + m)^{2-\varepsilon}$-time
algorithm would refute the SETH.


¹ We remark that Definition 2.1 and Lemma 2.3 are stated for decision problems while
DIAMETER is not a decision problem. However, the problem of deciding whether a given
undirected connected graph has diameter exactly k for some given k is a decision problem
and every algorithm for DIAMETER can be used to solve this decision problem with constant
overhead. We call DIAMETER GP-hard with respect to some parameter when this decision
version of DIAMETER is GP-hard with respect to that parameter.


Figure 3.3: Example for the construction in the proof of Proposition 3.1. The input
graph given on the left side has diameter two and the constructed graph on the right
side has diameter three. In each graph one longest shortest path is highlighted.


Proposition 3.1. DIAMETER is 4-GP-hard with respect to the combined
parameter odd cycle transversal number plus girth.


Proof. Let G = (V, E) be an arbitrary undirected connected input graph
with $V = \{v_1, v_2, \ldots, v_n\}$. We construct a new bipartite graph $G' = (V', E')$,
where

$V' := \{u_i, w_i \mid v_i \in V\}$, and

$E' := \{\{u_i, w_j\}, \{u_j, w_i\} \mid \{v_i, v_j\} \in E\} \cup \{\{u_i, w_i\} \mid v_i \in V\}$.


An example of this construction can be seen in Figure 3.3. We will now prove
that all properties of Definition 2.1 hold. It is easy to verify that the construction
can be computed in linear time and therefore the resulting instance is of linear
size as well. Observe that $\{u_i \mid v_i \in V\}$ and $\{w_i \mid v_i \in V\}$ are both independent
sets and therefore G' is bipartite. Notice further that for any edge $\{v_i, v_j\} \in E$
there is an induced cycle in G' containing the vertices $u_i$, $w_i$, $u_j$, and $w_j$.
Since G' is bipartite, there is no induced cycle of length three in G' and thus
the girth of G' is four.

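Lastly, we show that the diameter of G' is exactly one larger than the diameter
of G. We do so by proving for each pair $(v_i, v_j)$ of vertices in G that if $\operatorname{dist}(v_i, v_j)$
is odd, then

$\operatorname{dist}(u_i, w_j) = \operatorname{dist}(v_i, v_j)$ and $\operatorname{dist}(u_i, u_j) = \operatorname{dist}(v_i, v_j) + 1$,

and if $\operatorname{dist}(v_i, v_j)$ is even, then

$\operatorname{dist}(u_i, u_j) = \operatorname{dist}(v_i, v_j)$ and $\operatorname{dist}(u_i, w_j) = \operatorname{dist}(v_i, v_j) + 1$.

Since $\operatorname{dist}(u_i, w_i) = 1$ and $\operatorname{dist}(u_i, w_j) = \operatorname{dist}(u_j, w_i)$, this will conclude the
proof.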

In order to show that the diameter of G' is exactly one larger than the
diameter of G, let $c = \operatorname{dist}(v_i, v_j)$ be odd and let $P = (v_{a_0}, v_{a_1}, \ldots, v_{a_c})$ be a
shortest path from $v_i$ to $v_j$ where $v_{a_0} = v_i$ and $v_{a_c} = v_j$. Let

$P' = (u_{a_0}, w_{a_1}, u_{a_2}, \ldots, w_{a_c})$

be a path in G'. Clearly P' has length c and hence $\operatorname{dist}(u_i, w_j) \le c = \operatorname{dist}(v_i, v_j)$.
It also holds that $\operatorname{dist}(u_i, w_j) \ge c$. To verify this, assume towards a contradiction
that there is a path $P'' = (u_{b_0}, w_{b_1}, u_{b_2}, \ldots, w_{b_d})$ with $u_{b_0} = u_i$, $w_{b_d} = w_j$,
and $d < c$. Then there is a path $P''' = (v_{b_0}, v_{b_1}, \ldots, v_{b_d})$ between $v_i$ and $v_j$.
Note that if $v_{b_g} = v_{b_h}$ for some $g < h$, then $P'''$ can be replaced by
a shorter path where the subpath $P'''[b_{g+1}, b_h]$ is removed. Thus, the distance
between $v_i$ and $v_j$ is shorter than c, a contradiction.

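Concerning $\operatorname{dist}(u_i, u_j)$, observe that G' is bipartite and hence $\operatorname{dist}(u_i, u_j)$
is even. It holds that $\operatorname{dist}(u_i, u_j) > c$ as $\operatorname{dist}(u_i, u_j) \ge c$ for the same reason
as $\operatorname{dist}(u_i, w_j) \ge c$ and $\operatorname{dist}(u_i, u_j) \ne c$ as c is odd. Finally, since

$P' \circ (u_j) = (u_{a_0}, w_{a_1}, u_{a_2}, \ldots, w_{a_c}, u_{a_c})$

is a path of length c + 1 between $u_i$ and $u_{a_c} = u_j$, it holds that $\operatorname{dist}(u_i, u_j) = c + 1$.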
It remains to analyze the case where the distance $c = \operatorname{dist}(v_i, v_j)$ between
two vertices in G is even. Let again $P = (v_{a_0}, v_{a_1}, \ldots, v_{a_c})$ be a shortest path
from $v_i$ to $v_j$ where $v_{a_0} = v_i$ and $v_{a_c} = v_j$. This time, let

$P' = (u_{a_0}, w_{a_1}, u_{a_2}, \ldots, u_{a_c})$

be a path in G'. This shows that $\operatorname{dist}(u_i, u_j) \le \operatorname{dist}(v_i, v_j)$. It again holds
that $\operatorname{dist}(u_i, u_j) \ge c$ as if there were a path $P'' = (u_{b_0}, w_{b_1}, u_{b_2}, \ldots, u_{b_d})$
with $u_{b_0} = u_i$, $u_{b_d} = u_j$, and $d < c$, then there would also be a shorter
path $P''' = (v_{b_0}, v_{b_1}, \ldots, v_{b_d})$ between $v_i$ and $v_j$. Observe that $\operatorname{dist}(u_i, w_j)$
is odd as G' is bipartite. Thus, $\operatorname{dist}(u_i, w_j) > c$ as $\operatorname{dist}(u_i, w_j) < c$ again
implies $\operatorname{dist}(v_i, v_j) < c$ and $\operatorname{dist}(u_i, w_j) \ne c$ as one is odd and the other one is
even. Finally,

$P' \circ (w_j) = (u_{a_0}, w_{a_1}, u_{a_2}, \ldots, u_{a_c}, w_{a_c})$

proves that $\operatorname{dist}(u_i, w_j) = c + 1 = \operatorname{dist}(v_i, v_j) + 1$.
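
The construction from this proof is easy to check on small inputs. The following Python sketch builds the bipartite graph from an adjacency-list dictionary and compares diameters by breadth-first search. The representation of $u_i$ and $w_i$ as tuples ("u", i) and ("w", i) and the names `bipartite_lift` and `diameter` are choices made for this illustration only.

```python
from collections import deque

def diameter(adj):
    """Diameter of a connected graph via one breadth-first search per vertex."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def bipartite_lift(adj):
    """Build G' with edges {u_i, w_i} for all i and {u_i, w_j}, {u_j, w_i} per edge."""
    lifted = {("u", v): [("w", v)] for v in adj}
    lifted.update({("w", v): [("u", v)] for v in adj})
    for v in adj:
        for x in adj[v]:
            # Each edge {v, x} is visited from both sides, which yields
            # {u_v, w_x} on one visit and {u_x, w_v} on the other.
            lifted[("u", v)].append(("w", x))
            lifted[("w", x)].append(("u", v))
    return lifted

# Cycle on five vertices (diameter two), which is not bipartite itself.
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
lifted = bipartite_lift(cycle5)
```

For the cycle on five vertices, the lifted graph has ten vertices and diameter three, one more than the input, as the proof states.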


Distance to cographs. We continue with the distance to cographs. A graph
is a cograph if and only if it does not contain a $P_4$ as an induced subgraph, where
a $P_4$ is a path on four vertices. Providing an algorithm that matches the lower
bound of Abboud et al. [AVW16], we will show that DIAMETER parameterized
by distance k to cographs can be solved in $O(k \cdot (n + m) + 2^{O(k)})$ time. We will
use the following lemma.


Lemma 3.2. Let G = (V, E) be a graph and let $K \subseteq V$ be a vertex subset such
that each connected component in G − K has diameter at most two. Then, the
diameter of G can be computed in $O(|K| \cdot (n + m + 2^{4|K|}))$ time.


Proof. Let G = (V, E) be the input graph, let $K = \{x_1, x_2, \ldots, x_k\} \subseteq V$ be a
set of vertices such that each connected component in G − K has diameter at most
two and let G' := G − K. We first compute the set of all connected components
of G' and their respective diameter in linear time and store for each vertex the
information in which connected component it is contained. Note that we only
need to check for each connected component C whether C induces a clique in G',
as otherwise the diameter of C is two by assumption. In a second step, we perform
from each vertex $x_i \in K$ a breadth-first search in G and store the distance
between $x_i$ and each other vertex v in a table. Since a single breadth-first search
takes O(n + m) time, this takes overall $O(k \cdot (n + m))$ time.

Next we introduce some notation. The type of a vertex $v \in V \setminus K$ is a vector
of length k where the i-th entry describes the distance from v to $x_i$ with the
addition that any value above three is set to four. An example is given in
Figure 3.4. We say that a type is non-empty if there is at least one vertex of this
type. We compute for each vertex $v \in V \setminus K$ its type. Additionally we store for
each non-empty type the vertices of this type. Moreover, if all vertices of this
type are in the same connected component, then we store this information, and
otherwise we store that there are at least two different connected components
containing a vertex of that type. This takes $O(n \cdot k)$ time and there are at
most $4^{k}$ different types.


Figure 3.4: An example for types. The set K contains the two vertices $x_1$ and $x_2$ and
the connected components in G − K are depicted. The type of r is (1, 3), the type
of s is (2, 4), the type of t is (3, 4), the type of u is (1, 2), the type of v is (1, 1), and
the type of w is (2, 2).


Lastly, we iterate over all of the at most $4^{2k}$ pairs $(t_1, t_2)$ of non-empty types
(including the pairs where both types are the same) and compute the largest
distance between vertices of these types. Let y, z be two vertices with $\operatorname{type}(y) = t_1$
and $\operatorname{type}(z) = t_2$ that have maximum pairwise distance. We will first discuss
how to find y and z and then show how to correctly compute their distance
in O(k) time. Once we iterated over all pairs of types and reported the maximum
distance found, the diameter is either this or the largest distance from a
vertex $x_i \in K$. Since we stored all of the latter distances in a table, we can also
store the maximum with only constant overhead.


To compute y and z, we consider the following two cases. If both types only 
appear in the same connected component, then the distance between the two 
vertices of these types is at most two. Hence, we can discard this case (one can 
check in linear time whether the diameter of G is at least two). If two types 
appear in different connected components, then a longest shortest path between 
vertices of the respective types contains at least one vertex in K. Observe that
since each connected component has diameter at most two, each third vertex in
any shortest path must be in K. Thus a shortest y-z-path contains at least one
vertex $x_i \in K$ with $\operatorname{dist}(x_i, y) \le 3$. By definition, each vertex with the same
type as y has the same distance to $x_i$ and therefore the same distance to z
unless there is no shortest path from it to z that passes through $x_i$, that is, it
is in the same connected component as z. Hence, we can choose two arbitrary
vertices of the respective types in different connected components. Observe that 
we already precomputed for each type its vertices and whether it is represented 
in multiple connected components or not. Thus, checking whether there are 
two vertices of the respective type in different connected components is just 
a table lookup. We can compute the distance between y and z in O(k) time
by computing $\min_{x \in K} \{\operatorname{dist}(y, x) + \operatorname{dist}(x, z)\}$. Observe that the shortest path
from y to z contains $x_i$ and therefore $\operatorname{dist}(y, x_i) + \operatorname{dist}(x_i, z) = \operatorname{dist}(y, z)$. In
this way, we can compute the diameter of G in $O(k \cdot (n + m + 2^{4k}))$ time.
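
The steps of this proof can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation behind the stated running time: the graph is connected and given as a dictionary of adjacency lists, the precondition on K is not verified, and per type only up to two representatives from distinct components are stored. All function and variable names are hypothetical.

```python
from collections import deque
from itertools import product

def bfs(adj, s, banned=frozenset()):
    """Breadth-first-search distances from s, never entering banned vertices."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist and w not in banned:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def diameter_with_modulator(adj, K):
    """Diameter of G, assuming every component of G - K has diameter at most two."""
    kset = set(K)
    rest = [v for v in adj if v not in kset]
    # Exact distances from every modulator vertex cover all pairs touching K.
    from_k = [bfs(adj, x) for x in K]
    best = max((max(d.values()) for d in from_k), default=0)
    # Pairs inside one component of G - K have distance at most two in G.
    n = len(adj)
    if n >= 2:
        complete = all(len(adj[v]) == n - 1 for v in adj)
        best = max(best, 1 if complete else 2)
    # Label the connected components of G - K by one of their vertices.
    comp = {}
    for v in rest:
        if v not in comp:
            for w in bfs(adj, v, kset):
                comp[w] = v
    # Type: distances to K capped at four; keep representatives per type.
    reps = {}
    for v in rest:
        t = tuple(min(d[v], 4) for d in from_k)
        r = reps.setdefault(t, [])
        if len(r) < 2 and all(comp[u] != comp[v] for u in r):
            r.append(v)
    # For every pair of types, evaluate one pair in different components.
    for r1, r2 in product(reps.values(), repeat=2):
        pair = next(((y, z) for y in r1 for z in r2 if comp[y] != comp[z]), None)
        if pair:
            y, z = pair
            best = max(best, min(d[y] + d[z] for d in from_k))
    return best

# Path on six vertices; removing vertex 2 leaves components {0, 1} and {3, 4, 5}.
path6 = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
result = diameter_with_modulator(path6, [2])
```

For a pair in different components every path passes through K, so the minimum over the stored distances is exact for the chosen representatives.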


Note that the algorithm described in the proof above does not verify whether K 
is a vertex set such that each connected component in G — K has diameter 
at most two. Indeed, even distinguishing between diameter two and three
in $O(n^{2-\varepsilon})$ time for any $\varepsilon > 0$ would refute the SETH [AVW16]. Thus, the
above algorithm cannot efficiently verify whether the input meets the stated
conditions. Hence, when using Lemma 3.2, we need a way to ensure that each
connected component in G − K has diameter at most two. In cographs each connected
component has diameter at most two and hence we can show the following.


Proposition 3.3. DIAMETER can be solved in $O(k \cdot (n + m + 2^{16k}))$ time when
parameterized by the distance k to cographs.


Proof. Recall that a cograph does not contain a $P_4$ as an induced subgraph.
Thus, any connected cograph has diameter at most two (but not every diameter-two
graph is a cograph, consider e. g. a cycle on five vertices). Moreover, given a
graph G, one can determine in linear time whether G is a cograph and can return
an induced $P_4$ if this is not the case [Bre+08, CPS85]. Iteratively searching
for an induced $P_4$, adding all four vertices of a returned $P_4$ to a set K, and
deleting those vertices from G until it is $P_4$-free hence computes a set $K \subseteq V$
with $|K| \le 4k$ such that G − K is a cograph. The running time for computing K
is in $O(k \cdot (n + m))$. Applying Lemma 3.2 to this set K then yields a running
time of $O(|K| \cdot (n + m + 2^{4|K|})) \subseteq O(k \cdot (n + m + 2^{16k}))$ for computing the
diameter.
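
The bound $|K| \le 4k$ holds because the returned paths are vertex-disjoint and each of them must contain at least one vertex of any minimum modulator, so at most k paths are found. The greedy procedure can be sketched in Python as follows. Note that this sketch replaces the linear-time cograph recognition of [Bre+08, CPS85] by a simple brute-force search for an induced $P_4$, so it does not achieve the running time used in the proof. The graph is assumed to be a dictionary mapping each vertex to its set of neighbors, and the function names are hypothetical.

```python
def find_induced_p4(adj, alive):
    """Return vertices (a, b, c, d) inducing a path a-b-c-d among alive, or None."""
    for b in alive:
        for c in adj[b] & alive:
            for a in adj[b] & alive:
                if a == c or a in adj[c]:
                    continue  # a must be a neighbor of b but not of c
                for d in adj[c] & alive:
                    # d must differ from a and b and see neither of them
                    if d not in (a, b) and d not in adj[a] and d not in adj[b]:
                        return a, b, c, d
    return None

def greedy_cograph_modulator(adj):
    """Repeatedly delete all four vertices of an induced P4 until none is left."""
    alive = set(adj)
    modulator = []
    while (p4 := find_induced_p4(adj, alive)) is not None:
        modulator.extend(p4)
        alive -= set(p4)
    return modulator

path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
clique4 = {i: {j for j in range(4) if j != i} for i in range(4)}
```

On the path on four vertices the whole graph is removed, while a clique is already a cograph and yields an empty set.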
Observe that when a minimum deletion set K to cographs is given, then we can
solve DIAMETER parameterized by the distance k to cographs in $O(k \cdot (n + m + 2^{4k}))$
time. We remark that computing the distance to cographs exactly is
NP-complete [LY80].


3.3 Parameters Motivated by Properties of 
Social Networks 


In this section, we study DIAMETER with respect to parameters that are expected 
to be small in social networks. It was observed that social networks have the 
small-world property and a power-law degree distribution [LH08, Mil67, New03, 
New10, NP03]. The small-world property directly transfers to the diameter. 
The power-law degree distribution is often captured by the h-index as only 
few high-degree vertices exist in the network. Thus, we investigate parameters 
related to the diameter and to the h-index. We start with some degree-based 
parameters that are upper-bounded by the h-index and then continue with 
parameter combinations. 

Evald and Dahlgaard [ED16] showed that any $f(k) \cdot (n + m)^{2-\varepsilon}$-time algorithm
for DIAMETER parameterized by the maximum degree k for any computable
function f refutes the SETH. Observe that $2m = n \cdot a$, where a is the average
degree and therefore the standard algorithm (run a breadth-first search from
each vertex) takes $O(n \cdot (n + m)) = O(a \cdot n^{2})$ time. Since the average degree is
at most the maximum degree, this algorithm already matches the given lower
bound.


Observation 3.4. DIAMETER parameterized by average degree a is solvable
in $O(a \cdot n^{2})$ time.
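
For reference, the standard algorithm behind this observation can be written in a few lines. The Python sketch below assumes a connected graph given as a dictionary of adjacency lists; the function name is illustrative.

```python
from collections import deque

def diameter_by_bfs(adj):
    """Run one breadth-first search per vertex and return the largest distance."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

# Cycle on six vertices: n = 6, m = 6, average degree a = 2.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
result = diameter_by_bfs(cycle6)
```

Each of the n searches touches every vertex and every edge a constant number of times, which gives the $O(n \cdot (n + m))$ bound.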


We next investigate the parameter minimum degree and check whether the
average degree can be replaced by the minimum degree. Unsurprisingly, it cannot.
We show that DIAMETER is 2-GP-hard with respect to the combined parameter
bisection width plus minimum degree. In other words, if there is an $f(b) \cdot n^{2}$-time
algorithm, where b is the value of the combined parameter, then there is also
an $O(n^{2})$-time algorithm for DIAMETER. The bisection width of a graph G
is the minimum number of edges to delete from G in order to partition G
into two connected components whose numbers of vertices differ by at most one.
Computing the bisection width of a graph is known to be NP-hard [Bui+87].
Figure 3.5: Example for the construction in the proof of Proposition 3.5. The input
graph given on the left side has diameter two and the constructed graph on the right
side has diameter 2 + 4 = 6. The respective longest shortest paths are highlighted.


Proposition 3.5. DIAMETER is 2-GP-hard with respect to the combined pa- 
rameter bisection width plus minimum degree. 


Proof. Let G = (V, E) be an arbitrary undirected connected input graph
with $V = \{v_1, v_2, \ldots, v_n\}$ and let d be the diameter of G. We construct a
new graph $G' = (V', E')$ with diameter d + 4 as follows. Let

$V' := \{s_i, t_i, u_i \mid i \in [n]\} \cup \{w_i \mid i \in [3n]\}$, and
$E' := T \cup W \cup E''$, where
$T := \{\{s_i, t_i\}, \{t_i, u_i\} \mid i \in [n]\}$,
$W := \{\{u_1, w_1\}\} \cup \{\{w_1, w_i\} \mid i \in [3n] \setminus \{1\}\}$, and
$E'' := \{\{u_i, u_j\} \mid \{v_i, v_j\} \in E\}$.


An example of this construction can be seen in Figure 3.5. We will now prove
that all properties of Definition 2.1 hold. It is easy to verify that the graph G'
contains 6n vertices and 5n + m edges, and that G' can be computed in linear
time. Notice that $\{s_i, t_i, u_i \mid i \in [n]\}$ and $\{w_i \mid i \in [3n]\}$ are both of size 3n and
that there is only one edge ($\{u_1, w_1\}$) between these two sets of vertices. The
bisection width of G' is therefore one and the minimum degree is also one as $s_1$
has only $t_1$ as neighbor.

It remains to show that G' has diameter d + 4. First, notice that the subgraph
of G' induced by $\{u_i \mid i \in [n]\}$ is isomorphic to G. Second, $\operatorname{dist}(s_i, u_i) = 2$ for
all $i \in [n]$ and thus $\operatorname{dist}(s_i, s_j) = \operatorname{dist}(u_i, u_j) + 4 = \operatorname{dist}(v_i, v_j) + 4$ for all $s_i \ne s_j$.
Hence, the diameter of G' is at least d + 4. Third, note that it holds for
all vertices $x \in V' \setminus \{s_i\}$ that $\operatorname{dist}(s_i, x) > \operatorname{dist}(t_i, x)$. Lastly, observe that for
all $i \in [3n]$ and all vertices $x \in V'$ it holds that $\operatorname{dist}(w_i, x) \le \max\{\operatorname{dist}(s_1, x), 4\}$.
Thus the longest shortest path in G' is between two vertices $s_i$ and $s_j$ and it is
of length $\operatorname{dist}(u_i, u_j) + 4 = \operatorname{dist}(v_i, v_j) + 4 \le d + 4$.
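
As a sanity check, the construction can be built and measured on a small input. The Python sketch below assumes that the input vertices are numbered 0 to n − 1 (so $u_1$ and $w_1$ from the proof become index 0) and that the graph is a dictionary of adjacency lists. The tuple encoding of vertices and the name `pad_with_star` are illustrative choices.

```python
from collections import deque

def diameter(adj):
    """Diameter of a connected graph via one breadth-first search per vertex."""
    best = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

def pad_with_star(adj):
    """Attach a path s_i - t_i - u_i to every vertex and a star on 3n vertices."""
    n = len(adj)
    padded = {("u", i): [("u", j) for j in adj[i]] for i in range(n)}

    def link(a, b):
        padded.setdefault(a, []).append(b)
        padded.setdefault(b, []).append(a)

    for i in range(n):
        link(("s", i), ("t", i))
        link(("t", i), ("u", i))
    link(("u", 0), ("w", 0))          # the single edge crossing the bisection
    for i in range(1, 3 * n):
        link(("w", 0), ("w", i))      # star centered at the first w-vertex
    return padded

cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
padded = pad_with_star(cycle5)
```

For the cycle on five vertices (n = 5, m = 5, diameter two) the result has 6n = 30 vertices, 5n + m = 30 edges, minimum degree one, and diameter six.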


We mention in passing that the constructed graph in the proof of Proposi- 
tion 3.5 contains the original graph as an induced subgraph and if the original 
graph is bipartite, then so is the constructed graph. Thus, first applying the 
construction in the proof of Proposition 3.1 (see also Figure 3.3) and then the 
construction in the proof of Proposition 3.5 (see also Figure 3.5) shows that 
DIAMETER is GP-hard even when parameterized by the sum of girth, bisection
width, minimum degree, and odd cycle transversal number.


Corollary 3.6. DIAMETER is 6-GP-hard with respect to the combined parameter
odd cycle transversal number plus girth plus bisection width plus minimum degree.


h-index and diameter. We next investigate the combined parameter h- 
index plus diameter. The reduction by Roditty and Williams [RW13] produces 
instances with constant domination number and logarithmic vertex cover number 
(in the input size). Since the diameter d is linearly upper-bounded by the 
domination number and the h-index is linearly upper-bounded by the vertex cover 
number, any algorithm that solves DIAMETER parameterized by the combined 
parameter (d + h) in $(d + h)^{O(1)} \cdot (n + m)^{2-\varepsilon}$ time disproves the SETH. We next
present the main result in this chapter, that is, an algorithm for DIAMETER
parameterized by h-index plus diameter that almost matches the lower bound. We
say that the running time almost matches the lower bound since its dependence
on the parameter is roughly $O(h^{d} + d^{h}) = O(2^{d \log h + h \log d})$. Hence, it remains
open whether an algorithm with a running time of $(n + m) \cdot 2^{O(d+h)}$ exists. We
consider the following algorithm our main result of this chapter for two reasons.
First, its running time almost matches the lower bound for the relevant special
case where the input graph has similar properties to social networks (namely
small diameter and small h-index). Second, the dynamic program we develop
here is unusual in the sense that the solution of the problem is not related
to some table entry but rather to the size of the table.


Theorem 3.7. DIAMETER parameterized by diameter d plus h-index h is solvable
in $O((n + m) \cdot h \cdot (h^{d} + d^{h}))$ time.


Proof. Let G = (V, E) be an input graph for DIAMETER and let $H = \{x_1, \ldots, x_h\}$
be a set of h vertices with highest degree in G. Clearly, H can be computed in
linear time. Notice that all vertices in $V \setminus H$ have degree at most h in G.


We will describe a two-phase algorithm based on the following idea. In the 
first phase, it performs a breadth-first search from each vertex x; € H, stores 
the distance to each other vertex, and uses this to compute the type of each 
vertex, that is, a vector containing the distances to each vertex in H. In the 
second phase, the algorithm iteratively increases a value e and verifies whether 
there is a pair of vertices of distance at least e + 1 using dynamic programming. 
If at any point no such pair is found, then the diameter of G is e. 

The first phase is fairly straightforward. The algorithm performs a
breadth-first search from each vertex $x_i \in H$ and stores the distance from $x_i$ to each
vertex v in a table. We denote the maximum entry in this table by a. It then
iterates over each vertex $v \in V \setminus H$ and computes a vector of length h with
the i-th entry representing the distance from v to $x_i$. An example of types is
depicted in Figure 3.6. The algorithm also stores the number of vertices of each
type (if there is at least one such vertex). Since the distance to any vertex is at
most d, there are at most $d^{h}$ different types. Let $\mathcal{T}$ be the set of all (non-empty)
types and for some $t \in \mathcal{T}$ let $\#_t$ be the total number of vertices of type t.

For the second phase, we deploy a dynamic program that uses two tables N
and T. The table $N \colon V \times \mathbb{N} \to 2^{V}$ keeps for each vertex v and each possible
distance e track of all vertices that have distance exactly e in G' = G − H. The
table $T \colon V \times \mathbb{N} \times \mathcal{T} \to \mathbb{N}$ stores for each vertex v, each distance e, and each
type t the number of vertices of type t that have distance at most e from v in G.
Initially e = 1, $N[v, 0] = \{v\}$, and $N[v, 1] = N(v)$ for each vertex v. Before
we show how to initialize T, we explain the main idea behind it. Note that a
shortest path between v and a vertex w of type t either contains a vertex in H or
it is completely contained in G'. If it contains a vertex $x_i \in H$, then the distance
between v and w is $\operatorname{dist}(v, x_i) + \operatorname{dist}(x_i, w)$. Hence, assuming that a shortest
path contains a vertex in H, the distance between v and w is the minimum entry
in $\operatorname{type}(v) + \operatorname{type}(w) = \operatorname{type}(v) + t$. We denote this minimum entry by $\operatorname{mt}(v, t)$.
      x1  x2  x3
x1     0   1   1
x2     1   0   1
x3     1   1   0
u      1   1   2
v      1   2   1
w      2   1   1


Figure 3.6: An example of types. Each entry in the table on the right side displays the
distance between the two respective vertices. Each column is computed by a
breadth-first search from the respective vertex $x_i$ and each row is the type of the respective
vertex. The last row states for example that the distances between w and $x_1$, $x_2$, and $x_3$
are 2, 1, and 1, respectively. Thus, the type of w is (2, 1, 1).


Since a path of length zero or one between two vertices $v, w \in V \setminus H$ cannot
contain a vertex in H, the table can be initialized by

$$T[v, 0, t] = \begin{cases} 1, & \text{if } \operatorname{type}(v) = t, \\ 0, & \text{otherwise, and} \end{cases}$$

$$T[v, 1, t] = T[v, 0, t] + |\{u \in N[v, 1] \mid \operatorname{type}(u) = t\}|.$$

The algorithm now iteratively increases e and computes $N[v, e]$ and $T[v, e, t]$
for each $v \in V \setminus H$ and each $t \in \mathcal{T}$ until in one iteration $T[v, e, t] = \#_t$ for all v
and all t. Once this is the case, all vertices in $V \setminus H$ have pairwise distance at
most e. Since we already computed the distance from each vertex $x_i \in H$ to
each other vertex, the maximum over all these distances and e is the diameter
of G. The recursive formulas for N and T are as follows.


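$$N[v, e] = \Bigl(\bigcup_{u \in N[v, e-1]} N(u)\Bigr) \setminus \bigl(N[v, e-1] \cup N[v, e-2]\bigr) \quad \text{and}$$

$$T[v, e, t] = \begin{cases} \#_t, & \text{if } \operatorname{mt}(v, t) \le e, \text{ and} \\ T[v, e-1, t] + |\{u \in N[v, e] \mid \operatorname{type}(u) = t\}|, & \text{otherwise.} \end{cases}$$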


If at some iteration it holds that $T[v, e, t] = \#_t$ for all vertices v and all
types t, then the algorithm terminates and returns $\max\{e, a\}$. Observe that e
is equal to the number of table entries in T divided by $|V \setminus H| \cdot |\mathcal{T}|$. Thus, the
solution returned by the dynamic program does not depend on any value stored
within T but rather on the number of table entries in (the size of) T.

There are at most d iterations in which e is increased and table entries of N
and T are computed. Note that all values of the function mt can be precomputed
in $O(|\mathcal{T}|^{2} \cdot h) \subseteq O(d^{h} \cdot h \cdot n)$ time as $|\mathcal{T}| \le d^{h}$ and $|\mathcal{T}| \le n$. Note that the
computation of N closely resembles a breadth-first search in G' and since the
maximum degree in G' is h and the maximum depth is d, computing all entries
of N for a single vertex takes $O(h^{d})$ time. To compute all entries $T[v, e, t]$ for
all $t \in \mathcal{T}$ simultaneously, we iterate over each vertex $w \in N[v, e]$ and increase
the entry $T[v, e, \operatorname{type}(w)]$ by one. This takes $O(\sum_{e} |N[v, e]|) \subseteq O(h^{d})$ time
for each vertex. The running time of our algorithm is $O(h \cdot (n + m))$ for the first
phase and $O((d^{h} \cdot h \cdot n) + n \cdot h^{d})$ for the second phase. This yields an overall
running time of


$$O(h \cdot (n + m) + d^{h} \cdot h \cdot n + n \cdot h^{d}) \subseteq O((n + m) \cdot h \cdot (h^{d} + d^{h})).$$


Acyclic chromatic number and domination number. Finally, we ana- 
lyze the parameterized complexity of DIAMETER parameterized by the acyclic 
chromatic number a plus domination number d. Note that this combined pa- 
rameter is incomparable with the combined parameter h-index plus diameter 
as the h-index upper-bounds the acyclic chromatic number but the domination 
number upper-bounds the diameter. Recall that the acyclic chromatic number of a 
graph G is the smallest number a such that the vertices of G can be partitioned 
into a independent sets such that the induced subgraph of each combination 
of two of these independent sets is acyclic. We provide a SETH-based lower 
bound, adapting a reduction from SATISFIABILITY to DIAMETER by Roditty 
and Williams [RW13]. 


Proposition 3.8. There is no $f(a, d) \cdot (n + m)^{2 - \varepsilon}$-time algorithm for any computable function $f$ and any $\varepsilon > 0$ that solves DIAMETER parameterized by acyclic chromatic number $a$ plus domination number $d$ unless the SETH is false.


Proof. We provide a reduction from SATISFIABILITY to DIAMETER where the constructed instance has constant acyclic chromatic number and constant domination number and such that an $O((n + m)^{2 - \varepsilon})$-time algorithm refutes the SETH. We note that the reduction is an extension of the construction by Roditty and Williams [RW13, Theorem 9]. Let $\phi$ be a SATISFIABILITY instance with variable set $W$ and clause set $C$. Assume without loss of generality that $|W|$ is even. We construct an instance graph $G = (V, E)$ for DIAMETER as follows.

Randomly partition $W$ into two sets $W_1$ and $W_2$ of equal size. Add three sets $V_1$, $V_2$, and $B$ of vertices to $G$, where each vertex in $V_1$ (in $V_2$) represents one of $2^{|W|/2}$ possible truth assignments of the variables in $W_1$ (in $W_2$) and each vertex in $B$ represents a clause in $C$. Clearly, $|V_1| + |V_2| = 2 \cdot 2^{|W|/2}$ and $|B| = |C|$. For each $v_i \in V_1$ and each $u_j \in B$, if the truth assignment corresponding to $v_i$ does not satisfy the clause corresponding to $u_j$, then we add a new vertex $s_{ij}$ and the two edges $\{v_i, s_{ij}\}$ and $\{u_j, s_{ij}\}$ to $G$. We call the set of all these newly introduced vertices $S_1$. Now repeat the process for all vertices $w_i \in V_2$ and all $u_j \in B$ and call the newly introduced vertices $q_{ij}$. Let $S_2$ be the set of all $q_{ij}$. Finally, we add four new vertices $t_1, t_2, t_3, t_4$ and the sets

$$\{\{t_1, v\} \mid v \in V_1\},$$
$$\{\{t_2, s\} \mid s \in S_1\},$$
$$\{\{t_3, q\} \mid q \in S_2\},$$
$$\{\{t_4, w\} \mid w \in V_2\},$$
$$\{\{t_2, b\}, \{t_3, b\} \mid b \in B\}, \text{ and}$$
$$\{\{t_1, t_2\}, \{t_2, t_3\}, \{t_3, t_4\}\}$$

of edges to $G$. See Figure 3.7 for a schematic illustration of the construction.
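
To make the construction concrete, the following Python sketch builds the graph for a small CNF formula and computes its diameter by breadth-first search. It is an illustration under assumptions, not part of the original proof: clauses are given as lists of nonzero integers (DIMACS-style literals), $W_1$ is taken to be the first half of the variables, and the names `build_reduction_graph` and `diameter` as well as the vertex labels are hypothetical.

```python
from collections import deque
from itertools import product


def build_reduction_graph(num_vars, clauses):
    """Build the graph of the reduction; num_vars is assumed to be even."""
    half = num_vars // 2
    adj = {}

    def add_edge(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def satisfies(assignment, offset, clause):
        # assignment covers the variables offset+1, ..., offset+half
        for lit in clause:
            idx = abs(lit) - 1 - offset
            if 0 <= idx < half and assignment[idx] == (lit > 0):
                return True
        return False

    # side 1: V_1 and S_1 with hubs t1, t2; side 2: V_2 and S_2 with hubs t4, t3
    for side, offset, hub_v, hub_s in ((1, 0, "t1", "t2"), (2, half, "t4", "t3")):
        for assignment in product((False, True), repeat=half):
            v = ("V", side, assignment)
            add_edge(hub_v, v)
            for j, clause in enumerate(clauses):
                if not satisfies(assignment, offset, clause):
                    x = ("S", side, assignment, j)   # the vertex s_ij or q_ij
                    add_edge(v, x)
                    add_edge(("B", j), x)
                    add_edge(hub_s, x)
    for j in range(len(clauses)):
        add_edge("t2", ("B", j))
        add_edge("t3", ("B", j))
    add_edge("t1", "t2")
    add_edge("t2", "t3")
    add_edge("t3", "t4")
    return adj


def diameter(adj):
    """Largest BFS distance over all sources (the graph is assumed to be connected)."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        best = max(best, max(dist.values()))
    return best


# Usage: (x1 or x2) is satisfiable, (x1) and (not x1) is not.
print(diameter(build_reduction_graph(2, [[1, 2]])))
print(diameter(build_reduction_graph(2, [[1], [-1]])))
```

The brute-force diameter computation is only meant for checking tiny instances; the point of the reduction is that the constructed graph has exponentially many vertices in $|W|/2$.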

We will first show that $\phi$ is satisfiable if and only if $G$ has diameter five and then show that the domination number and acyclic chromatic number of $G$ are at most four and five, respectively. Observe that the diameter of $G$ is at most five since each vertex is adjacent to some vertex in $\{t_1, t_2, t_3, t_4\}$ and these four vertices are of pairwise distance at most three. First assume that $\phi$ is satisfiable. Then, there exists some truth assignment $\beta$ of the variables such that all clauses are satisfied, that is, the two partial truth assignments of $\beta$ with respect to the variables in $W_1$ and $W_2$ together satisfy all clauses. Let $v_1 \in V_1$ and $v_2 \in V_2$ be the vertices corresponding to $\beta$. Thus, for each $b \in B$ we have $\operatorname{dist}(v_1, b) + \operatorname{dist}(v_2, b) \ge 5$. Observe that all paths from a vertex in $V_1$ to a vertex in $V_2$ that do not pass a vertex in $B$ pass through $t_2$ and $t_3$ and are hence of length at least five. Thus, the diameter of $G$ is $\operatorname{dist}(v_1, v_2) = 5$.

For the reverse direction, assume that there is no satisfying truth assignment for $\phi$. Then for each pair of vertices $v_1 \in V_1$ and $v_2 \in V_2$ it holds that there is some clause in $\phi$ that is not satisfied by either of the two partial truth assignments corresponding to $v_1$ and $v_2$. Hence, the vertex $u_j$ corresponding to this clause guarantees that $\operatorname{dist}(v_1, v_2) \le \operatorname{dist}(v_1, u_j) + \operatorname{dist}(u_j, v_2) = 4$. Next, observe that each pair of vertices that does not consist of one vertex in $V_1$ and one vertex in $V_2$ is of distance at most four as guaranteed by the vertices $t_1$, $t_2$, $t_3$, and $t_4$. Thus, the diameter of $G$ is at most four.

Figure 3.7: A schematic illustration of the construction in the proof of Proposition 3.8. A vertex $s_{ij}$ is only connected to $v_i$ and $u_j$ and $q_{ij}$ is only connected to $w_i$ and $u_j$. Note that the resulting graph has acyclic chromatic number at most five (the five independent sets are $V_1 \cup V_2$, $B$, $S_1 \cup S_2 \cup \{t_1, t_4\}$, $\{t_2\}$, and $\{t_3\}$ and are also represented by colors). Moreover, the domination number of the graph is at most four as $\{t_1, t_2, t_3, t_4\}$ is a dominating set.

The domination number of $G$ is at most four since $\{t_1, t_2, t_3, t_4\}$ is a dominating set. The acyclic chromatic number of $G$ is at most five as $V_1 \cup V_2$, $\{t_2\}$, $B$, $\{t_3\}$, and $S_1 \cup S_2 \cup \{t_1, t_4\}$ each induce an independent set and each combination of two of them not including $S_1 \cup S_2 \cup \{t_1, t_4\}$ only induces independent sets or stars. Moreover, note that $S_1 \cup S_2 \cup \{t_1, t_4\} \cup \{t_2\}$ and $S_1 \cup S_2 \cup \{t_1, t_4\} \cup \{t_3\}$ each only induce a star and an independent set. Lastly, $S_1 \cup S_2 \cup \{t_1, t_4\} \cup V_1 \cup V_2$ induces two trees of depth 2 (where $t_1$ and $t_4$ are the roots and $S_1$ and $S_2$ are the leaves) and $S_1 \cup S_2 \cup \{t_1, t_4\} \cup B$ induces a disjoint union of stars and isolated vertices as each vertex in $S_1 \cup S_2 \cup \{t_1, t_4\}$ has degree at most one in $G[B \cup S_1 \cup S_2 \cup \{t_1, t_4\}]$.

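Now assume that there was an $f(k) \cdot (n + m)^{2 - \varepsilon}$-time algorithm for DIAMETER parameterized by domination number plus acyclic chromatic number for some computable function $f$ and some $\varepsilon > 0$. The constructed graph has $O(2^{|W|/2} \cdot |C|)$ vertices and edges, and since $f(9)$ is some constant, this implies an algorithm with running time

$$\begin{aligned} O\big((2^{|W|/2} \cdot |C|)^{2 - \varepsilon}\big) &= O\big(2^{(|W|/2)(2 - \varepsilon)} \cdot |C|^{2 - \varepsilon}\big) \\ &= O\big(2^{|W|(1 - \varepsilon/2)} \cdot |C|^{2 - \varepsilon}\big) \\ &= 2^{|W|(1 - \varepsilon')} \cdot (|C| + |W|)^{O(1)} \quad \text{for some } \varepsilon' > 0. \end{aligned}$$

Such an algorithm for DIAMETER would refute the SETH [RW13].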


3.4 Concluding Remarks 


We conclude this chapter with some possible avenues for further research 
regarding DIAMETER. We believe that a broader reflection on the techniques we 
used (e.g. dynamic programming) is better deferred to the concluding chapter of 
this thesis, where we can compare the different dynamic programs we develop in 
this thesis. Concerning the complexity landscape shown in Figure 3.2, only a few open cases remain. Perhaps most interesting among them are the following two questions. Is there a $k^{O(1)} \cdot (n + m)^{1 + o(1)}$-time algorithm for the distance $k$ to interval graphs and is there an $f(d) \cdot n^2$-time algorithm for DIAMETER parameterized by the diameter $d$? Our algorithms working with parameter combinations are probably not competitive with state-of-the-art unparameterized algorithms due to their exponential dependency on the parameter value(s), even in graphs with properties similar to social networks and even though they cannot be improved by much unless the SETH fails. So the question remains whether there are parameters $k_1, \ldots, k_\ell$ (that are possibly not displayed in Figure 3.2) that are small in real-world applications and that allow for practically relevant running times like $\prod_{i=1}^{\ell} k_i \cdot (n + m)$ or even $(n + m) \cdot \sum_{i=1}^{\ell} k_i$. A parameter capturing the special community structures of social networks [GN02] might be a good candidate to be included in such a parameter combination.

Chapter 4


Length-Bounded Cuts 


In this chapter, we investigate, on a conceptual level, a peculiar case of how 
to compute the solution for a problem, once the table of a dynamic program 
is completely filled. Similarly to Chapter 3, the first important question we 
have to answer is what a table entry should represent. Answering this question 
requires some structural observations and is by far the most complicated part 
of this chapter. However, once we have answered this question, determining the 
table dimension and computing each table entry are fairly straightforward while 
computing the solution from the filled table is not. 

The problem we study in this chapter stems from the area of network flows. The study of network flows and, in particular, of the EDGE-DISJOINT PATHS problem began in the 1950s with the work of Ford and Fulkerson [FF56] and has since then constituted a prominent research area in graph algorithms. In the EDGE-DISJOINT PATHS problem, we are given an undirected graph $G$, two vertices $s$ and $t$, called the source and the target, and a positive integer $\beta$. The question is whether there is a collection of at least $\beta$ edge-disjoint s-t-paths in $G$. It is worth pointing out that nowadays there are many more efficient algorithms than the one by Ford and Fulkerson [FF56] for finding $\beta$ edge-disjoint s-t-paths in a given graph (see e.g. the work by Dinitz [Din06]).

A natural counterpart of EDGE-DISJOINT PATHS is EDGE CUT. Therein, the question is whether there is a set $F$ of at most $\beta$ edges such that there is no s-t-path in the graph after removing the edges in $F$. There is a strong dual relationship between EDGE-DISJOINT PATHS and EDGE CUT in the sense that, if both problems admit a solution for a given $\beta$, then the value of $\beta$ is optimal, that is, it is not possible to find $\beta + 1$ edge-disjoint s-t-paths and the removal of any set of $\beta - 1$ edges leaves $s$ and $t$ in the same connected component. Consequently, since EDGE CUT can be solved in polynomial time, so can EDGE-DISJOINT PATHS. Quite naturally, there are many variants of the above described network flow/cut problems such as multicommodity flows, unsplittable flows, and the related cut problems (e.g. Schrijver [Sch03] provides further examples and formal definitions). Unlike EDGE-DISJOINT PATHS and EDGE CUT, it is not always the case that the respective flow and the cut problem belong to the same complexity class. We investigate a variant of EDGE CUT called LENGTH-BOUNDED CUT. It originates from network design and telecommunications and Gouveia et al. [GPS08], Huygens and Ridha Mahjoub [HR07], and Huygens et al. [Huy+07] describe further applications. LENGTH-BOUNDED CUT is an example where the cut problem is harder than the respective flow problem. LENGTH-BOUNDED CUT is NP-hard [Bai+10] while the respective flow problem is polynomial-time solvable [MM10].

Our main contribution in this chapter is a dynamic-programming-based 
polynomial-time algorithm for LENGTH-BOUNDED CUT on proper interval 
graphs. This confirms a conjecture by Bazgan et al. [Baz+19]. We conclude
this chapter by showing some limitations of our approach when trying to
adapt it to interval graphs. The existence of a polynomial-time algorithm for
LENGTH-BOUNDED CUT on interval graphs was also posed as an open problem 
by Bazgan et al. [Baz+19]. 


4.1 Problem Definition and Related Work 


In this chapter, we study LENGTH-BOUNDED CUT, which is the cut problem related to the variant of EDGE-DISJOINT PATHS where an additional bound $\lambda$ is given and the sought collection of s-t-paths can only contain paths of length at most $\lambda$. This problem has been introduced by Adámek and Koubek [AK71] and is formally defined as follows.


LENGTH-BOUNDED CUT

Input: An undirected graph $G := (V, E)$, two vertices $s, t$, and two positive integers $\beta, \lambda$.

Question: Is there a subset $F \subseteq E$ with $|F| \le \beta$ such that there is no s-t-path of length at most $\lambda$ in $G' := (V, E \setminus F)$?
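
To make the problem definition concrete, the following Python sketch decides LENGTH-BOUNDED CUT by exhaustive search. It is only an illustration of the definition with exponential running time and is unrelated to the algorithms discussed in this chapter; the function names and the edge-list representation are assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def bfs_distance(vertices, edges, s, t):
    """Length of a shortest s-t-path, or infinity if t is unreachable from s."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    dist = {s: 0}
    queue = deque([s])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist.get(t, float("inf"))


def length_bounded_cut_bruteforce(vertices, edges, s, t, beta, lam):
    """Try every edge set F with |F| <= beta and test whether dist(s, t) > lam afterwards."""
    edges = list(edges)
    for size in range(min(beta, len(edges)) + 1):
        for removed in combinations(range(len(edges)), size):
            removed = set(removed)
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if bfs_distance(vertices, kept, s, t) > lam:
                return True
    return False


# Usage: a short s-t-path 0-1-2 and a longer one 0-3-4-2.
V = range(5)
E = [(0, 1), (1, 2), (0, 3), (3, 4), (4, 2)]
print(length_bounded_cut_bruteforce(V, E, 0, 2, beta=1, lam=2))
print(length_bounded_cut_bruteforce(V, E, 0, 2, beta=1, lam=5))
```

In the example, one deleted edge suffices for $\lambda = 2$ because only the short path has to be destroyed, whereas for $\lambda = |V| = 5$ the problem coincides with EDGE CUT and two edges are needed.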


An example of LENGTH-BOUNDED CUT is given in Figure 4.1. For $\lambda = |V|$ one is left with the original problem EDGE CUT which is polynomial-time solvable. LENGTH-BOUNDED CUT is also solvable in polynomial time if $\lambda \le 3$ [MM10]. However, Baier et al. [Bai+10] showed that LENGTH-BOUNDED CUT is NP-hard for $\lambda = 4$. The related flow problem LENGTH-BOUNDED FLOW, where we restrict the flow to paths of length at most $\lambda$, can be solved in polynomial time via a reduction to linear programming [Bai+10, KS06, MM10].

Figure 4.1: An example graph. The dashed edges form a solution for LENGTH-BOUNDED CUT with $\beta = 2$ and $\lambda = 3$.

We note that the result of Baier et al. [Bai+10] in fact gives NP-hardness for LENGTH-BOUNDED CUT for each constant $\lambda \ge 4$. Thus, in order to obtain tractability results, one presumably has to either consider a different parameterization or combine $\lambda$ with some other parameter. Golovach and Thilikos [GT11] first studied LENGTH-BOUNDED CUT from the viewpoint of parameterized complexity. They showed that LENGTH-BOUNDED CUT is fixed-parameter tractable for the combined parameter $\beta + \lambda$. It is worth noting that the parameter $\beta$ alone gives W[1]-hardness [GT11]. Later, Fluschnik et al. [Flu+18] proved that it is unlikely that a polynomial kernel in $\beta + \lambda$ exists. Dvořák and Knop [DK18] considered structural parameters for LENGTH-BOUNDED CUT. They showed that it is W[1]-hard when parameterized by the pathwidth of the input graph while it is fixed-parameter tractable when parameterized by the treedepth of the input graph. Kolman [Kol18] gave an $O(\lambda^{\tau} \cdot (n + m))$-time algorithm for LENGTH-BOUNDED CUT, where $\tau$ is the treewidth of $G$. Furthermore, LENGTH-BOUNDED CUT is fixed-parameter tractable for the parameter $\lambda$ if $G$ is planar [Kol18] (it remains NP-complete on planar graphs [Flu+18]). Bazgan et al. [Baz+19] studied both restrictions on special graph classes as well as structural parameterizations for LENGTH-BOUNDED CUT. They provided an XP-algorithm for the maximum degree of the input graph $G$ and fixed-parameter tractability for the feedback edge number. Furthermore, they presented a polynomial-time algorithm for co-graphs while showing NP-completeness even if the input is restricted to bipartite graphs or split graphs. Finally, LENGTH-BOUNDED CUT is W[1]-hard with respect to the combined parameter pathwidth and maximum degree and with respect to the feedback vertex number [BHK20].


4.2 Polynomial-Time Algorithm for Proper 
Interval Graphs 


In this section, we present a polynomial-time algorithm for LENGTH-BOUNDED CUT on proper interval graphs. To this end, for each vertex $v \notin \{s, t\}$, we define a set of vertices that contains $v$ and $t$. The algorithm for LENGTH-BOUNDED CUT on proper interval graphs is then a dynamic program that stores for each vertex $v$ and each possible distance $d$ ($2 \le d \le \lambda$) the minimum size of a cut that makes each vertex in the described set have distance at least $d$ from $s$.

Recall that each vertex $v$ in a proper interval graph can be represented by an interval $[b_v, f_v]_{\mathbb{R}}$ such that two vertices $u, w$ are adjacent in $G$ if and only if $[b_u, f_u]_{\mathbb{R}} \cap [b_w, f_w]_{\mathbb{R}} \ne \emptyset$ and no interval representing a vertex is properly contained in the interval representing another vertex. Observe that we can assume without loss of generality that $b_s < b_t$ as we can otherwise “mirror” the graph by setting $b_v := -f_v$ and $f_v := -b_v$ for each vertex $v \in V$. It is folklore that one can assume that $|\{b_v \mid v \in V\}| = |V|$. We further assume that the vertices in $V \setminus \{s, t\}$ are named $v_1, v_2, \ldots, v_{n-2}$ such that $b_{v_i} < b_{v_j}$ for all $i < j$. We first show that we can safely ignore all vertices $v$ with $f_v < b_s$ or $f_t < b_v$. It is worth noting that the following lemma holds for interval graphs and not only for proper interval graphs.


Lemma 4.1. Let $I = (G = (V, E), s, t, \beta, \lambda)$ be an instance of LENGTH-BOUNDED CUT where $G$ is an interval graph and $b_s < b_t$ in the interval representation. Let $L := \{u \in V \mid f_u < b_s\}$ and $R := \{u \in V \mid f_t < b_u\}$. Then, $I' := (G - (L \cup R), s, t, \beta, \lambda)$ is an equivalent instance of LENGTH-BOUNDED CUT.
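
The reduction step of Lemma 4.1 can be sketched as follows. The snippet assumes that the interval representation is given as a dictionary mapping each vertex to a pair $(b_v, f_v)$ with $b_s < b_t$; this representation and the function name `prune_outer_vertices` are assumptions made for illustration.

```python
def prune_outer_vertices(intervals, s, t):
    """Remove the sets L and R of Lemma 4.1 from an interval representation.

    intervals maps each vertex v to a pair (b_v, f_v); b_s < b_t is assumed.
    Returns the remaining representation together with L and R.
    """
    b_s, _ = intervals[s]
    _, f_t = intervals[t]
    left = {u for u, (b, f) in intervals.items() if f < b_s}    # L: ends before s starts
    right = {u for u, (b, f) in intervals.items() if f_t < b}   # R: starts after t ends
    removed = left | right
    kept = {u: iv for u, iv in intervals.items() if u not in removed}
    return kept, left, right


# Usage: "a" lies completely left of s and "c" completely right of t.
rep = {"s": (2, 4), "t": (6, 8), "a": (0, 1), "b": (3, 7), "c": (9, 10)}
kept, L, R = prune_outer_vertices(rep, "s", "t")
print(sorted(kept), L, R)
```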


Proof. Let $I, I', G, s, t, \beta, \lambda, L$, and $R$ be as defined above. We first show that $I_R := (G - R, s, t, \beta, \lambda)$ is an equivalent instance. Note that $s, t \notin L \cup R$ and hence $I_R$ and $I'$ are instances of LENGTH-BOUNDED CUT. Note further that deleting vertices from any input graph cannot decrease the distance between any pair of vertices and hence if $I$ is a yes-instance, then so are $I_R$ and $I'$. Hence it remains to show that if $I_R$ is a yes-instance, then so is $I$.

Assume towards a contradiction that $I_R$ is a yes-instance and $I$ is a no-instance. Then there is a set $F_R$ of $\beta$ edges in $G - R$ such that the distance between $s$ and $t$ in $G_R := (V \setminus R, E \setminus (F_R \cup \{\{u, v\} \in E \mid u \in R\}))$ is at least $\lambda + 1$. Since $I$ is a no-instance, there is a path $P$ of length at most $\lambda$ between $s$ and $t$ in $G^* := (V, E \setminus F_R)$. As $G_R$ and $G^*$ only differ in $R$, each path of length at most $\lambda$ between $s$ and $t$ in $G^*$ contains at least one vertex from $R$. We will show that $\deg_G(t) \le |F_R|$ and hence there is an s-t-cut of size at most $\beta$ in $G$ and thus $I$ is a yes-instance. This contradicts the assumption that $I$ is a no-instance and hence finishes the proof that $I_R$ is equivalent to $I$.

We start by giving some basic notation for the proof to come. We use sets of vertices that have a certain distance from $s$ in some subgraph $H$ of $G$. To this end, we define $X_H^{p} := \{u \in V \mid \operatorname{dist}_H(s, u) = p\}$ for each distance $p$. Analogously, we define $X_H^{\le p} := \{u \in V \mid \operatorname{dist}_H(s, u) \le p\}$ and $X_H^{\ge p} := \{u \in V \mid \operatorname{dist}_H(s, u) \ge p\}$.

Let $d := \operatorname{dist}_{G^*}(s, t)$ and let $t'$ be the vertex in $P$ with maximum $b_{t'}$. Since $P$ contains a vertex from $R$, it holds that $b_{t'} > f_t$ and hence $t' \notin N_G(t)$. Since $t'$ is on a shortest s-t-path in $G^*$ and $t' \notin N_G(t)$, it holds that $t' \in X_{G^*}^{\le d-2}$. Now consider the set $K$ of vertices that are part of a shortest s-t'-path in $G^*$ and that are neighbors of $t$ in $G$. By construction $K \subseteq X_{G^*}^{\le d-3}$ and for each $y \in [b_t, f_t]_{\mathbb{R}}$ there is a vertex $v \in K$ with $y \in [b_v, f_v]_{\mathbb{R}}$. We next show that $|F_R| \ge \deg_G(t)$. To this end, consider any vertex $u \in N_G(t)$. If $u \in X_{G^*}^{\le d-2}$, then it holds that $\{u, t\} \in F_R$. Otherwise $u \in X_{G^*}^{\ge d-1}$. Observe that for each $u \in N_G(t)$ it holds by definition that there is a $y \in [b_u, f_u]_{\mathbb{R}} \cap [b_t, f_t]_{\mathbb{R}} \ne \emptyset$ and hence there is a vertex $v \in K$ with $y \in [b_v, f_v]_{\mathbb{R}}$ and hence $\{u, v\} \in E$. Note that $u \in X_{G^*}^{\ge d-1}$ and $v \in K \subseteq X_{G^*}^{\le d-3}$. Since

$$\operatorname{dist}_{G^*}(s, u) \ge d - 1 > d - 3 + 1 \ge \operatorname{dist}_{G^*}(s, v) + 1,$$

it holds that $\{u, v\} \in F_R$. Since $\{v, t\} \in F_R$ for all $v \in N_G(t) \cap X_{G^*}^{\le d-2}$, since for all $v \in N_G(t) \cap X_{G^*}^{\ge d-1}$ there is some $w \in K$ such that $\{v, w\} \in F_R$, and since $X_{G^*}^{\le d-2} \cap X_{G^*}^{\ge d-1} = \emptyset$, there is a unique edge for each $v \in N_G(t)$ in $F_R$. Hence, $\beta = |F_R| \ge \deg_G(t)$ and thus there is a trivial s-t-cut of size at most $\beta$ in $G$ that contains all edges incident to $t$. Thus, $I$ is a yes-instance.

We conclude the proof by showing that $I'$ is equivalent to $I_R$. Note that we consider undirected graphs and hence we can exchange the roles of $s$ and $t$ and mirror the graph by setting $b'_v := -f_v$ and $f'_v := -b_v$ for each vertex $v \in V$. Note that all vertices in $L$ (originally fulfilling $f_v < b_s$) now satisfy $b'_v > f'_s$. Hence, if we interchange the names of $s$ and $t$, then they satisfy the condition of $R$ and hence we can use the argument above to show that $I'$ and $I_R$ are equivalent.

Using Lemma 4.1, we can always assume that there is no vertex $v$ with $f_v < b_s$ or $b_v > f_t$. We next show that if there is a solution, then there is also a solution in which the distance from $s$ to $v_j$ is non-decreasing in $j$.


Lemma 4.2. Let $G = (V, E)$ be a proper interval graph where no vertex $v$ satisfies $f_t < b_v$ or $b_s > f_v$ and let $F$ be a set of edges. Let $d$ be the distance from $s$ to $t$ in $G' := (V, E \setminus F)$. Then, there is a set $F'$ of edges with $|F'| \le |F|$ such that $\operatorname{dist}_{G''}(s, t) \ge d$ in $G'' := (V, E \setminus F')$ and $\operatorname{dist}_{G''}(s, v_i) \le \operatorname{dist}_{G''}(s, v_j)$ for each $v_i, v_j \in V \setminus \{s, t\}$ with $b_{v_i} < b_{v_j}$.


Proof. Let $G, s, t, F, G'$, and $d$ be as defined above. The main idea of this proof is to construct a sequence of graphs which starts with the graph $G'$ and ends with the sought graph $G''$. To this end, we define for each vertex $v \in V$ in a graph $H = (V, E_H)$ a specific distance $D_H(v)$. We define $D_H(v)$ to be the length of a shortest path $P = (s = u_0, u_1, u_2, \ldots, u_a = v)$ from $s$ to $v$ in $H$ such that for all $y \in [a - 1]$ it holds that $b_{u_y} < b_{u_{y+1}}$. As a special case, if $u_a = t$, then we only require that for all $y \in [a - 2]$ it holds that $b_{u_y} < b_{u_{y+1}}$. We call such paths monotone, and if no monotone s-v-path exists, then we set $D_H(v) := \infty$. Observe that for each graph $H$ it holds that $D_H(s) = 0$ and $D_H(v) \ge \operatorname{dist}_H(s, v)$. Let $\mathcal{G} := \{G^* := (V, E^*) \mid E^* \subseteq E \wedge |E^*| \ge |E \setminus F|\}$. We present a sequence of graphs $(G' := G_1, G_2, \ldots, G_k)$ such that


(1) $G_\ell = (V, E_\ell) \in \mathcal{G}$ for each $\ell \in [k]$,
(2) $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$ for each $\ell \in [k - 1]$, and
(3) $D_{G_k}(v) \le D_{G_k}(w)$ for all $v, w \in V \setminus \{s, t\}$ with $b_v < b_w$.
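
The monotone distance $D_H$ used in conditions (2) and (3) can be computed by processing the vertices in order of their left endpoints. The following Python sketch illustrates the definition under assumptions: the graph is an adjacency dictionary `adj`, the left endpoints are given in a dictionary `b` with pairwise distinct values, and the function name `monotone_distances` is hypothetical.

```python
import math


def monotone_distances(adj, b, s, t):
    """Compute D_H(v) for every vertex v of H.

    On a monotone path the first step out of s is unconstrained; every later
    step must lead to a vertex with a larger left endpoint, except that the
    final step into t is unconstrained as well.
    """
    D = {v: math.inf for v in adj}
    D[s] = 0
    # Predecessors with smaller left endpoint are final when v is processed.
    for v in sorted((u for u in adj if u != s), key=lambda u: b[u]):
        for u in adj[v]:
            if u == s:
                D[v] = min(D[v], 1)
            elif b[u] < b[v]:
                D[v] = min(D[v], D[u] + 1)
    # Special case: the last edge of a monotone s-t-path may decrease b.
    D[t] = min([D[t]] + [D[u] + 1 for u in adj[t]])
    return D


# Usage: a is reachable only via c, but b_c > b_a, so no monotone s-a-path exists.
adj = {"s": {"c"}, "a": {"c", "t"}, "c": {"s", "a", "e"}, "e": {"c", "t"}, "t": {"e", "a"}}
b = {"s": 0, "a": 1, "c": 2, "e": 3, "t": 4}
print(monotone_distances(adj, b, "s", "t"))
```

In the example, the ordinary distance from $s$ to $a$ is two while $D_H(a) = \infty$, which illustrates that $D_H(v) \ge \operatorname{dist}_H(s, v)$ can be strict.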


Claim 4.3. If such a sequence of graphs exists, then $F_k := E \setminus E_k$ and $G'' := G_k$ satisfy Lemma 4.2.


Proof of Claim 4.3. First, we show that $\operatorname{dist}_{G_k}(s, v) = D_{G_k}(v)$ for all $v$. Assume towards a contradiction that there is a vertex $v \ne t$ with $\operatorname{dist}_{G_k}(s, v) \ne D_{G_k}(v)$. Consider any shortest s-v-path $P$ in $G_k$. Let $w$ be the first vertex on $P$ with $\operatorname{dist}_{G_k}(s, w) \ne D_{G_k}(w)$ and let $w'$ be its predecessor. By this definition, it holds that $\operatorname{dist}_{G_k}(s, w') = D_{G_k}(w')$ and $b_w < b_{w'}$ as otherwise $\operatorname{dist}_{G_k}(s, w) = D_{G_k}(w)$. Since $D_{G_k}(w) \ge \operatorname{dist}_{G_k}(s, w)$ and $D_{G_k}(w) \ne \operatorname{dist}_{G_k}(s, w)$, it follows that

$$D_{G_k}(w) > \operatorname{dist}_{G_k}(s, w) = \operatorname{dist}_{G_k}(s, w') + 1 = D_{G_k}(w') + 1,$$

a contradiction to (3) and $b_w < b_{w'}$.

Now assume that $\operatorname{dist}_{G_k}(s, t) \ne D_{G_k}(t)$. Let $P$ be a shortest s-t-path in $G_k$. Let $v$ be the predecessor of $t$ in $P$. We have shown that $\operatorname{dist}_{G_k}(s, v) = D_{G_k}(v)$ and hence $\operatorname{dist}_{G_k}(s, t) = \operatorname{dist}_{G_k}(s, v) + 1 = D_{G_k}(v) + 1 = D_{G_k}(t)$. The last step follows from $D_{G_k}(t) \ge \operatorname{dist}_{G_k}(s, t)$ together with the fact that $D_{G_k}(t) \le D_{G_k}(v) + 1$ as $v$ is a neighbor of $t$ in $G_k$ and the special case in the definition of $D$ allows us to ignore $b_t$.

The claim now easily follows. Note that (1) ensures that $G_k \in \mathcal{G}$ and hence $|F_k| \le |F|$. It follows from (2) that

$$\operatorname{dist}_{G''}(s, t) = D_{G_k}(t) \ge D_{G_{k-1}}(t) \ge \ldots \ge D_{G_1}(t) = D_{G'}(t) \ge \operatorname{dist}_{G'}(s, t) = d.$$

Finally, (3) states for all $v, w \in V \setminus \{s, t\}$ with $b_v < b_w$ that

$$\operatorname{dist}_{G''}(s, v) = D_{G''}(v) \le D_{G''}(w) = \operatorname{dist}_{G''}(s, w).$$


We now describe how to obtain the sequence $(G' = G_1, G_2, \ldots, G_k)$ of graphs. To this end, we need a rather technical order over the graphs in $\mathcal{G}$. We say that $(V, E_a) = G_a <_{\mathcal{G}} G_b = (V, E_b)$ for $G_a, G_b \in \mathcal{G}$ if and only if


• $|E_a| > |E_b|$,


• $|E_a| = |E_b|$ and there exists a $v \in V \setminus \{t\}$ such that $D_{G_a}(v) < D_{G_b}(v)$ and $D_{G_a}(w) = D_{G_b}(w)$ for all $w \in V \setminus \{t\}$ with $b_w < b_v$, or


• $|E_a| = |E_b|$, $D_{G_a}(v) = D_{G_b}(v)$ for all $v \in V \setminus \{t\}$, and $D_{G_a}(t) < D_{G_b}(t)$.


Notice that $<_{\mathcal{G}}$ defines a total preorder on $\mathcal{G}$, that is, the order $<_{\mathcal{G}}$ is transitive, reflexive, and for each two graphs $G_a, G_b \in \mathcal{G}$ with $G_a \ne G_b$ it holds that $G_a <_{\mathcal{G}} G_b$ or $G_b <_{\mathcal{G}} G_a$.

Let $G_\ell$ be a graph in the sequence. We will guarantee that each graph in the sequence fulfills (1) and (2). Consequently, if $G_\ell$ satisfies (3), then we have found the last graph in the sequence. Otherwise, we describe how to obtain another graph $G_{\ell+1} \in \mathcal{G}$ such that (2) holds for $G_\ell$ and $G_{\ell+1}$ and $G_{\ell+1} <_{\mathcal{G}} G_\ell$. Since $<_{\mathcal{G}}$ is a total preorder on the finite set $\mathcal{G}$, we can only build a finite sequence and hence at some point a graph has to satisfy (3).

Since $G_\ell = (V, E_\ell)$ does not satisfy (3), there is some minimum $j$ such that $D_{G_\ell}(v_j) > D_{G_\ell}(v_{j+1})$. Let

$$X := \{x \in N_G(v_{j+1}) \mid (b_x < b_{v_j} \vee x = s) \wedge \{v_j, x\} \in E \setminus E_\ell \wedge \{v_{j+1}, x\} \in E_\ell\},$$
$$Y := \{y \in N_G(v_j) \mid (b_y > b_{v_{j+1}} \vee y = t) \wedge \{v_{j+1}, y\} \in E \setminus E_\ell \wedge \{v_j, y\} \in E_\ell\}.$$


See Figure 4.2 for an example of $X$ and $Y$. We distinguish between the two cases $|X| \ge |Y|$ and $|X| < |Y|$.

Figure 4.2: An example for $X$ and $Y$. Red (vertical) edges are contained in $E \setminus E_\ell$. For the sake of readability we do not depict edges in $E_\ell$. Note that $X = \{v_{j-2}\}$ and $Y = \{v_{j+3}\}$.


Case 1 ($|X| \ge |Y|$): Let

$$E_{\ell+1} := (E_\ell \setminus \{\{v_j, y\} \mid y \in Y\}) \cup \{\{v_j, x\} \mid x \in X\},$$

and $G_{\ell+1} := (V, E_{\ell+1})$. Since $|X| \ge |Y|$, $X \cap Y = \emptyset$, and $G_\ell \in \mathcal{G}$, it holds that $|E_{\ell+1}| \ge |E_\ell| \ge |E \setminus F|$ and thus $G_{\ell+1} \in \mathcal{G}$. Clearly, for all $v \in V \setminus \{t\}$ with $b_v < b_{v_j}$ we have $D_{G_{\ell+1}}(v) = D_{G_\ell}(v)$ as $E_{\ell+1}$ and $E_\ell$ only differ in edges incident to $v_j$. Let $w$ be the predecessor of $v_{j+1}$ in a shortest monotone s-$v_{j+1}$-path in $G_\ell$. Since $D_{G_\ell}(v_j) > D_{G_\ell}(v_{j+1})$ it holds that $w \ne v_j$ and hence $b_w < b_{v_j} < b_{v_{j+1}} \le f_w$. Moreover, vertex $w$ is contained in $X$ as otherwise $D_{G_\ell}(v_j) \le D_{G_\ell}(w) + 1 = D_{G_\ell}(v_{j+1})$. Thus,

$$D_{G_{\ell+1}}(v_j) = D_{G_{\ell+1}}(w) + 1 = D_{G_\ell}(w) + 1 = D_{G_\ell}(v_{j+1}) < D_{G_\ell}(v_j)$$

and combined with $|E_{\ell+1}| \ge |E_\ell|$ this yields $G_{\ell+1} <_{\mathcal{G}} G_\ell$.

It remains to show that $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$. Consider a shortest monotone s-t-path $P$ in $G_{\ell+1}$. If $P$ does not pass through $v_j$, then it is also a monotone s-t-path in $G_\ell$ and hence $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$. If $P$ passes through $v_j$, then let $z$ be the successor of $v_j$ in $P$. Note that if $z \in Y$, then $\{v_{j+1}, z\} \in E_\ell$, and if $z \notin Y$, then $z = v_{j+1}$ or $\{v_{j+1}, z\} \in E_\ell$ as $b_{v_j} < b_z$. Hence it holds that

$$D_{G_\ell}(z) \le D_{G_\ell}(v_{j+1}) + 1 = D_{G_{\ell+1}}(v_j) + 1 = D_{G_{\ell+1}}(z).$$

Finally, let $P'$ be a shortest monotone s-z-path in $G_\ell$ and let $P'' := P' \circ P[z, t]$. Note that $P''$ is a monotone s-t-path of length at most $D_{G_{\ell+1}}(t)$ in $G_\ell$ and thus $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$.

Case 2 ($|X| < |Y|$): We set


$$E_{\ell+1} := (E_\ell \setminus \{\{v_{j+1}, x\} \mid x \in X\}) \cup \{\{v_{j+1}, y\} \mid y \in Y\}.$$


Since $|X| < |Y|$, $X \cap Y = \emptyset$, and $G_\ell \in \mathcal{G}$, it holds that $|E_{\ell+1}| > |E_\ell|$ and therefore $G_{\ell+1} \in \mathcal{G}$ and $G_{\ell+1} <_{\mathcal{G}} G_\ell$. It remains to show $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$.

Since $E_\ell$ and $E_{\ell+1}$ only differ in edges incident to $v_{j+1}$, for all $v$ with $b_v < b_{v_{j+1}}$ it holds that $D_{G_{\ell+1}}(v) = D_{G_\ell}(v)$. Let $P$ be a shortest monotone s-$v_{j+1}$-path in $G_{\ell+1}$ and let $w$ be the predecessor of $v_{j+1}$ in $P$. By definition of $E_{\ell+1}$ it holds that $w = v_j$ or $\{w, v_j\} \in E_\ell$ and hence $D_{G_\ell}(v_j) \le D_{G_{\ell+1}}(v_{j+1})$. Let $P'$ be a shortest monotone s-t-path in $G_{\ell+1}$. If $P'$ does not pass through $v_{j+1}$, then it is also a monotone s-t-path in $G_\ell$ as $E_\ell$ and $E_{\ell+1}$ only differ in edges incident to $v_{j+1}$ and hence $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$. If $P'$ passes through $v_{j+1}$, then let $z$ be the successor of $v_{j+1}$ in $P'$. Since only edges between $v_{j+1}$ and the vertices in $Y$ are contained in $E_{\ell+1}$ but not in $E_\ell$, it holds that $z \in Y$. Hence, it holds that $\{v_j, z\} \in E_\ell$ and thus

$$D_{G_\ell}(z) \le D_{G_\ell}(v_j) + 1 \le D_{G_{\ell+1}}(v_{j+1}) + 1 = D_{G_{\ell+1}}(z).$$

Finally, let $P''$ be a shortest monotone s-z-path in $G_\ell$ and let $P''' := P'' \circ P'[z, t]$. Since $P'''$ is a monotone s-t-path of length at most $D_{G_{\ell+1}}(t)$ in $G_\ell$, we obtain $D_{G_\ell}(t) \le D_{G_{\ell+1}}(t)$.

This concludes the proof as we have shown that the sought sequence of graphs 
is finite and how to obtain each graph in it from the previous. 


Using Lemma 4.2, we now provide the main result of this chapter, that is, a dynamic program that solves LENGTH-BOUNDED CUT on proper interval graphs in polynomial time. The dynamic program stores for each vertex $v \in V \setminus \{t\}$ and each possible distance $d$ the minimum size of a cut that makes each vertex $u$ with $b_u \ge b_v$ or $u = t$ have distance at least $d$ from $s$.


Theorem 4.4. LENGTH-BOUNDED CUT can be solved in O(n? - m) time if the 
input graph is a proper interval graph. 


Proof. We prove the statement by developing a dynamic program. We first 
state some general observations and derive from them the main idea behind the 
dynamic program. We then show how the entries of the table of the dynamic 
program are computed and how to compute the solution for LENGTH-BOUNDED 
CUT from the filled table. We continue by proving the correctness of our algorithm and conclude by analyzing its running time.

We assume that, in the input graph $G := (V, E)$, there is no s-t-cut of size at most $\beta$ as this cut can be detected in $O(n \cdot m)$ time [FF56] and the answer for LENGTH-BOUNDED CUT is then always yes. Thus, $\deg_G(s) > \beta$ and $\deg_G(t) > \beta$ as otherwise the set of edges incident to $s$ or $t$ is an s-t-cut of size at most $\beta$. Furthermore, by Lemma 4.1, we can assume that there is no vertex $v$ with $f_v < b_s$ or $b_v > f_t$. By Lemma 4.2, we can assume that we search for a solution in which for all $v_i, v_j \in V \setminus \{s, t\}$ with $i < j$ it holds that $\operatorname{dist}(s, v_i) \le \operatorname{dist}(s, v_j)$. Hence, we construct a table $T \colon V \times \mathbb{N} \to \mathbb{N}$ which stores for each vertex $v_i \in V \setminus \{s, t\}$ and each possible distance $2 \le d \le \lambda$ the minimum number of edges that have to be deleted from $G'' := (V', E') := G - \{t\}$ to ensure the following. First, $\operatorname{dist}(s, v_k) \le \operatorname{dist}(s, v_\ell)$ for all $k < \ell < i$. Second, each vertex $v_j \in V \setminus \{s, t\}$ with $j \ge i$ has distance at least $d$ from $s$.


We start by showing how to initialize the table $T$. Note that $v_1, v_2, \ldots, v_{\deg(s)}$ are neighbors of $s$ and $v_{\deg(s)+1}, v_{\deg(s)+2}, \ldots, v_{n-2}$ are not. Hence, to increase the distance of each $v_j$ with $j \ge i$ for some given $i$ to at least two, one has to delete all edges between $s$ and the vertices in $\{v_j \mid i \le j \le \deg(s)\}$. Thus, the table is initialized with $T[v_i, 2] = 0$ for all $i > \deg(s)$ and $T[v_i, 2] = \deg(s) - i + 1$ for all $i \le \deg(s)$. We further initialize $T[v_1, d] = \deg(s)$ for all $d \ge 3$.
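
As a small illustration of this initialization, the following Python sketch fills the corresponding table entries. It assumes that vertices are referred to by their indices $1, \ldots, n - 2$ and that `deg_s` counts the neighbors of $s$ among them; the function name `init_table` and the dictionary layout are hypothetical.

```python
def init_table(n_inner, deg_s, lam):
    """Initialize T[(i, 2)] for all inner vertices v_i and T[(1, d)] for 3 <= d <= lam.

    n_inner is the number n - 2 of vertices other than s and t (assumed indexing).
    """
    T = {}
    for i in range(1, n_inner + 1):
        # delete the edges {s, v_j} with i <= j <= deg(s); none are left if i > deg(s)
        T[(i, 2)] = max(deg_s - i + 1, 0)
    for d in range(3, lam + 1):
        T[(1, d)] = deg_s
    return T


# Usage: five inner vertices, the first three of which are adjacent to s.
print(init_table(5, 3, 4))
```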

We next show how to compute the solution to LENGTH-BOUNDED CUT once the table $T$ is completely filled. Since we seek a solution $F$ such that in $H := (V, E \setminus F)$ it holds that $\operatorname{dist}_H(s, t) > \lambda$, each vertex $u \in N_H(t)$ has to satisfy $\operatorname{dist}_H(s, u) \ge \lambda$. Note that $\deg_G(t) > \beta$ and hence there is at least one vertex $v \in N_H(t)$. Thus, to compute the solution for LENGTH-BOUNDED CUT, we iterate over $v \in N_G(t) \setminus \{s\}$ and compute $T[v, \lambda] + |\{u \in N_G(t) \mid u = s \vee b_u < b_v\}|$. Note that this corresponds to the statement that each neighbor $u$ of $t$ in $G$ has distance at least $\lambda$ from $s$ or the edge $\{u, t\}$ was removed. Hence, the distance between $s$ and $t$ in the resulting graph is at least $\lambda + 1$. Further, if we take the minimum value over all iterations and compare it to $\beta$, then this solves LENGTH-BOUNDED CUT.


It remains to present the recursive formula for $T$, to prove the correctness of our dynamic program, and to analyze its running time. We start by showing how to compute $T$. For the sake of simplicity, we also store, for each table entry $T[v_i, d]$ with $d \ge 2$, in a second table $S[v_i, d]$ the vertex $v_j \in V \setminus \{s, t\}$ with minimum $j$ such that $v_j$ has distance $d - 1$ from $s$ in some solution corresponding to $T[v_i, d]$. We initialize $S[v_i, 2] = v_1$ for all $v_i$ and $S[v_1, d] = v_1$ for all $d \ge 3$. Note that $S[v_i, d] = v_1$ might not represent what we claimed if we seek to remove the edge $\{s, v_1\}$. However, in this case there is no solution as we assume that $\deg(s) > \beta$. For increasing values of $d \ge 3$, we iterate over $2 \le i \le n - 2$ and compute

$$T[v_i, d] = \min_{j < i} \{T[v_j, d - 1] + |C[S[v_j, d - 1], v_j, v_i]|\},$$
$$S[v_i, d] = v_j \text{ with } j = \min\{\arg\min_{j' < i} \{T[v_{j'}, d - 1] + |C[S[v_{j'}, d - 1], v_{j'}, v_i]|\}\},$$

where $C[v_h, v_j, v_i]$ is a function that represents for each triple $(v_h, v_j, v_i)$ of vertices with $h \le j < i$ the set of edges between a vertex $v_\ell$ with $h \le \ell < j$ and a vertex $v_r$ with $r \ge i$. For technical reasons we exclude $s$ and $t$ here and hence the formal definition is

$$C[v_h, v_j, v_i] := \{\{v_\ell, v_r\} \in E \mid h \le \ell < j < i \le r\}.$$

The vertex $v_h$ is used to avoid double counting.
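
The edge set $C$ can be illustrated with a few lines of Python. The sketch assumes that the vertices $v_1, \ldots, v_{n-2}$ are represented by their indices, that edges are pairs of indices, and that the index ranges are $h \le \ell < j < i \le r$; the function name `crossing_edges` is hypothetical.

```python
def crossing_edges(edges, h, j, i):
    """Edges {v_l, v_r} with h <= l < j < i <= r (s and t are not part of `edges`)."""
    assert j < i, "the triple is assumed to satisfy j < i"
    result = set()
    for a, c in edges:
        l, r = min(a, c), max(a, c)
        if h <= l < j and i <= r:
            result.add((l, r))
    return result


# Usage on a small graph over the indices 1..4
E = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
print(crossing_edges(E, 1, 3, 4))
print(crossing_edges(E, 1, 2, 3))
```

Restricting the left endpoint to indices at least $h$ reflects the remark above: edges leaving vertices with smaller index are already accounted for in the table entry for distance $d - 1$.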

We continue by proving that $S$ and $T$ store exactly what they are supposed to. Assume towards a contradiction that there is a vertex $v_i$ and a distance $d \ge 2$ such that $S[v_i, d]$ or $T[v_i, d]$ were computed incorrectly. Then there is also a smallest $d$ such that there is a vertex $v_i$ for which $S[v_i, d]$ or $T[v_i, d]$ are computed incorrectly and we assume that $v_i$ is the vertex with the smallest index $i$ such that $S[v_i, d]$ or $T[v_i, d]$ is computed incorrectly. Since we have already shown that the initialization for $d = 2$ is correct, we focus on the case $d > 2$ and distinguish between the three cases that $S[v_i, d]$ was computed incorrectly, that $T[v_i, d] > c$, or that $T[v_i, d] < c$, where $c$ is the correct value of $T[v_i, d]$.

If $T[v_i, d] < c$, then let $v_j$ be a vertex with $j < i$ that minimizes the sum $T[v_j, d - 1] + |C[S[v_j, d - 1], v_j, v_i]|$. Since we assume that $T[v_j, d - 1]$ is computed correctly (recall that $d$ was chosen to be the minimum value for which $S$ or $T$ was computed incorrectly), there is a set $F_1$ of $T[v_j, d - 1]$ edges such that in the graph $H' := (V', E' \setminus F_1)$ it holds that $\operatorname{dist}_{H'}(s, v_r) \ge d - 1$ for all $r \ge j$ and $\operatorname{dist}_{H'}(s, v_\ell) \le \operatorname{dist}_{H'}(s, v_k)$ for all $\ell < k < j$. Let $v_h := S[v_j, d - 1]$. Since $S[v_j, d - 1]$ is by assumption computed correctly, it holds for all $v_\ell$ with $\ell < h$ that $\operatorname{dist}_{H'}(s, v_\ell) \le d - 3$. Thus, $F_1$ contains all edges between vertices in $\{v_\ell \mid \ell < h\}$ and $\{v_r \mid r \ge i\}$. Since $C[v_h, v_j, v_i]$ is the set of all edges between vertices $v_\ell$ with $h \le \ell < j$ and vertices $v_r$ with $r \ge i$, it holds that there is no edge between a vertex of distance at most $d - 2$ from $s$ and a vertex $v_r$ with $r \ge i$ in $H := (V', E' \setminus (F_1 \cup C[v_h, v_j, v_i]))$. Hence each such vertex $v_r$ is of distance at least $d$ from $s$ in $H$. It remains to show that $\operatorname{dist}_H(s, v_\ell) \le \operatorname{dist}_H(s, v_k)$ for all $\ell < k < i$. Note that $H$ and $H'$ only differ in edges in $C[v_h, v_j, v_i]$, that is, in edges between vertices $v_\ell$ and $v_r$ with $h \le \ell < j$ and $r \ge i$. Since those $v_\ell$ have distance $d - 2$ and those $v_r$ have distance at least $d - 1$ from $s$ in $H'$, it holds that $\operatorname{dist}_H(s, v_\ell) = \operatorname{dist}_{H'}(s, v_\ell)$ for all $\ell < i$ and thus also $H$ fulfills $\operatorname{dist}_H(s, v_\ell) \le \operatorname{dist}_H(s, v_k)$ for all $\ell < k < i$. Since $T[v_i, d] = |F_1 \cup C[v_h, v_j, v_i]|$ and $F_1 \cap C[v_h, v_j, v_i] = \emptyset$, it holds that

$$T[v_i, d] = |F_1| + |C[v_h, v_j, v_i]| = T[v_j, d - 1] + |C[S[v_j, d - 1], v_j, v_i]|,$$

and thus $T[v_i, d] \ge c$, a contradiction.

If $T[v_i, d] > c$, then there is a cut $F'$ that contains less than $T[v_i, d]$ edges
such that in the respective graph $H' := (V', E' \setminus F')$ all vertices $v_r$ with $r > i$
have distance at least d from s and $\mathrm{dist}_{H'}(s, v_k) \le \mathrm{dist}_{H'}(s, v_\ell)$ for all $k < \ell < i$.
Then, there is a vertex $v_j$ such that $v_j$ and all vertices $v_r$ with $r > j$ have
distance at least $d-1$ from s in $H'$ and all vertices $v_\ell$ with $\ell < j$ have
distance at most $d-2$ from s in $H'$. Hence, $F'$ has to contain all edges
in $F'' := \{\{v_\ell, v_r\} \in E \mid \ell < j < i < r\}$ as otherwise a vertex $v_r$ with $r > i$
would have distance at most $d-1$ from s in $H'$. Let $v_h := S[v_j, d-1]$. We
partition the set $F''$ into two disjoint sets $F''_L := \{\{v_\ell, v_r\} \in E \mid \ell < h < i < r\}$
and $F''_R := \{\{v_\ell, v_r\} \in E \mid h < \ell < j < i < r\}$. Let $H = (V', E' \setminus (F' \setminus F''_R))$.
Notice that H and $H'$ only differ in edges in $F''_R$, that is, in edges incident to
vertices $v_\ell$ and $v_r$ with $h < \ell < j$ and $r > i$. Since those $v_\ell$ have distance $d-2$
and those $v_r$ have distance at least $d-1$ from s in H, the distance between s and
vertices $v_\ell$ with $\ell < j$ is the same in $H'$ and H. Hence, $\mathrm{dist}_H(s, v_k) \le \mathrm{dist}_H(s, v_\ell)$
for all $k < \ell < j$. Thus, it holds that $T[v_j, d-1] \le |F' \setminus F''_R|$ as $T[v_j, d-1]$ was
computed correctly by assumption. Note further that $F''_R$ is by definition equal
to $C[S[v_j, d-1], v_j, v_i]$ and

\[ T[v_j, d-1] + |C[v_h, v_j, v_i]| \ge \min_{j' < i} \{ T[v_{j'}, d-1] + |C[S[v_{j'}, d-1], v_{j'}, v_i]| \}. \]

Thus, $c = |F'| = |F' \setminus F''_R| + |F''_R| \ge T[v_j, d-1] + |C[v_h, v_j, v_i]| \ge T[v_i, d]$, a
contradiction.

Finally, assume towards a contradiction that $S[v_i, d]$ is computed incorrectly
but $T[v_i, d]$ is computed correctly. Since $T[v_i, d]$ is computed correctly, there is
a set $F'$ of $T[v_i, d]$ edges such that in $H = (V', E' \setminus F')$ it holds for all $k < \ell < i$
that $\mathrm{dist}_H(s, v_k) \le \mathrm{dist}_H(s, v_\ell)$ and for all $j > i$ that $\mathrm{dist}_H(s, v_j) \ge d$. Then,
there is a vertex $v_j$ such that $\mathrm{dist}_H(s, v_\ell) \le d-2$ for all $\ell < j$ and $\mathrm{dist}_H(s, v_r) \ge d-1$
for all $r > j$. Let, without loss of generality, $F'$ be a set of edges such
that there is no edge set $F''$ with the same property as described above
where the respective vertex $v_{j'}$ satisfies $j' < j$. Let $S[v_i, d] = v_h$. We show
that $h = j$. If $j < h$, then $T[v_h, d-1] + |C[S[v_h, d-1], v_h, v_i]| < T[v_i, d]$
as otherwise h would have been chosen smaller. This, however, contradicts
the assumption that $T[v_i, d]$ is computed correctly. If $j > h$, then by
definition, $T[v_i, d] = T[v_h, d-1] + |C[S[v_h, d-1], v_h, v_i]|$. Thus, there is a set $F''$
such that $|F''| = T[v_i, d] = |F'|$ and in $H' := (V', E' \setminus F'')$ it holds for
all $k < \ell < i$ that $\mathrm{dist}_{H'}(s, v_k) \le \mathrm{dist}_{H'}(s, v_\ell)$. Moreover, it holds for all $j > i$
that $\mathrm{dist}_{H'}(s, v_j) \ge d$, for all $\ell < h$ that $\mathrm{dist}_{H'}(s, v_\ell) \le d-2$, and for all $r > h$
that $\mathrm{dist}_{H'}(s, v_r) \ge d-1$. This contradicts the definition of $F'$.

We conclude this proof by analyzing the running time of our algorithm.
We first show how to compute $|C[v_h, v_j, v_i]|$ for all triples $(v_h, v_j, v_i)$ of vertices
in $O(n^2 \cdot m)$ time. To this end, we first compute a table $A[v_j, v_i]$, where

\[ A[v_j, v_i] = |\{ \{v_\ell, v_r\} \in E \mid \ell < j < i < r \}|. \]

Note that A can be computed in $O(n^2 \cdot m)$ time by iterating over all edges $\{v_\ell, v_r\}$
(we assume $\ell < r$) and all entries $A[v_j, v_i]$ and incrementing the entry
if $\ell < j < i < r$. Once A is computed, we compute $|C[v_h, v_j, v_i]| = A[v_j, v_i] - A[v_h, v_i]$
in constant time per table entry. Since there are $O(n^3)$ table entries, the overall
running time for this preprocessing is $O(n^2 \cdot m)$ (note that the input graph is a
connected interval graph and hence $O(n) \subseteq O(m)$).
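
The preprocessing step can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure of the proof: vertices are represented by their indices $1, \dots, n$, the names `build_A` and `c_size` are hypothetical, and the sketch fixes one boundary convention ($\ell \le j < i \le r$) for the counted edges; with a different convention only the loop bounds change, while the subtraction trick stays the same.

```python
def build_A(n, edges):
    """A[j][i] = number of edges (l, r) with l <= j < i <= r.

    The boundary convention (<= versus <) is an assumption of this sketch.
    Each edge touches O(n^2) entries, giving O(n^2 * m) time overall.
    """
    A = [[0] * (n + 1) for _ in range(n + 1)]
    for l, r in edges:
        if l > r:
            l, r = r, l
        for j in range(l, r):             # l <= j < r
            for i in range(j + 1, r + 1):  # j < i <= r
                A[j][i] += 1
    return A

def c_size(A, h, j, i):
    """Number of edges (l, r) with h < l <= j < i <= r, for h < j < i.

    Edges with l <= h are counted in both A[j][i] and A[h][i], so the
    difference removes exactly those (this avoids the double counting).
    """
    return A[j][i] - A[h][i]
```

Each lookup of `c_size` takes constant time once `A` is available, which is what the running-time argument relies on.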

Each table entry $S[v_i, d]$ and $T[v_i, d]$ can be computed in $O(n)$ time by
iterating over at most n vertices and computing the sum of a table entry in T
and the size of a table entry in C, thereby keeping track of the minimum value
and which iteration led to this minimum. Since there are $O(n \cdot \lambda)$ table entries,
the overall running time is $O(n^2 \cdot \lambda)$. As we may assume that $\lambda < n$ (each path
has length at most n), the running time is bounded by $O(n^3)$. Lastly, computing
the solution takes $O(n)$ time as we have to iterate over up to n neighbors $v_i$ of t
and for each we have to compute $|\{v_\ell \mid \ell < i \wedge \{v_\ell, t\} \in E\}|$. This computation
takes constant time as we can compute the smallest index j of a vertex that is
adjacent to t in G and then compute $i - j + 1$. Thus, the overall running time
for our algorithm is O(n? + m).


The main point in the proof of Theorem 4.4 where we need to assume that
the input graph is a proper interval graph and not an interval graph is the
application of Lemma 4.2. In the following section, we will investigate problems
arising when trying to adapt Lemma 4.2 for interval graphs. Concluding this
section, we want to emphasize the way the solution is computed in the proof of
Theorem 4.4 once the tables T and S are completely filled. Rather than looking
at a single entry or taking the maximum or minimum entry in a given column,
we iterate over all entries $T[v_i, d]$ with $d = \lambda$ and add to each the number of
neighbors $v_j$ of t with $j < i$. The solution then corresponds to the minimum
such sum. This way of finding a solution goes to show that each of the four
guiding questions (even the one that looks the simplest) can have a surprising
or non-trivial answer.


4.3 Falsifying Assumptions for Interval Graphs 


In this section, we discuss some problems that arise when trying to adapt the 
algorithm behind Theorem 4.4 for interval graphs. The only difference between 
interval graphs and proper interval graphs is that in interval graphs there can 
be pairs (v, w) of vertices such that $N[v] \subseteq N[w]$. Intuitively, it does not seem
to make sense to remove an edge {u,v} while leaving an edge {u, w} in the 
solution graph as each shortest s-t-path containing v and using the edge {u,v} 
can then be replaced by a path containing w and {u,w}. This leads to the 
following conjecture. 


Conjecture 4.5. Let $G = (V, E)$ be an interval graph and let F be a set of
edges. Let d be the distance from s to t in $G' = (V, E \setminus F)$. Then, there
is a set $F'$ of edges with $|F'| \le |F|$ such that for $G'' := (V, E \setminus F')$ it holds
that $\mathrm{dist}_{G''}(s, t) \ge d$ and for each $v, w \in V \setminus \{s, t\}$ with $N[v] \subseteq N[w]$ it holds
that if $\{u, v\} \in F'$ for some $u \in V$, then also $\{u, w\} \in F'$.


Conjecture 4.5 would be helpful to show that Lemma 4.2 also holds for 
interval graphs. Unfortunately, Conjecture 4.5 is false as shown in the example 
in Figure 4.3. Therein, the only solution for removing three edges deletes some 
edges incident to w and one edge incident to v such that the only remaining 
path between s and t passes through both v and w. A next natural conjecture 
could be that a similar approach to the dynamic program behind Theorem 4.4 
could still work, where we order the vertices by their b- or their f-values. 


Conjecture 4.6. Let $G = (V, E)$ be an interval graph and let F be a set of
edges. Let d be the distance from s to t in $G' := (V, E \setminus F)$. Let $F_b$ and $F_f$
be sets of minimum size such that $G_b := (V, E \setminus F_b)$ and $G_f := (V, E \setminus F_f)$
fulfill $\mathrm{dist}_{G_b}(s, t) \ge d$, $\mathrm{dist}_{G_f}(s, t) \ge d$, and the following. For each $v, w \in V \setminus \{s, t\}$
it holds that if $b_v < b_w$, then $\mathrm{dist}_{G_b}(s, v) \le \mathrm{dist}_{G_b}(s, w)$. Moreover, if $f_v < f_w$,
then $\mathrm{dist}_{G_f}(s, v) \le \mathrm{dist}_{G_f}(s, w)$. Then, $|F_b| \le |F|$ or $|F_f| \le |F|$.

Figure 4.3: An interval graph and its interval representation. The dashed edges
show the unique solution for LENGTH-BOUNDED CUT with $\beta = 3$ and $\lambda = 5$. Note
that $N[v] \subseteq N[w]$, that the edge $\{u, v\}$ is dashed, and that the edge $\{u, w\}$ is not.
Since the dashed edges are the only solution, Conjecture 4.5 is false.


Figure 4.4: An interval graph and its interval representation. The dashed edge
is the unique solution for LENGTH-BOUNDED CUT with $\beta = 1$ and $\lambda = 5$.
Let G’ denote the graph without the dashed edge. Note that it holds in G’ 
that dista (s, u) < dista (s, £) = dista (s, y) < dista (s, v) but by < by and fu < fe. 
Note further that the dashed edge is the only edge whose removal increases the distance 
between s and t and hence Conjecture 4.6 is false. 


Again, Conjecture 4.6 is false as shown in Figure 4.4. The idea behind this 
counterexample is to include short intervals with only two neighbors that prevent 
any reordering of their neighbors after the removal of some edges. This shows 
that the basic idea of our algorithm for proper interval graphs cannot work for 
interval graphs as we cannot order the vertices by their b- or their f-values for 
the dynamic program. Moreover, note that the dashed edge in Figure 4.4 is the 
only solution and that after removing this edge, the resulting graph contains 
a $C_4$ induced by the vertices u, x, v, and y. Thus, we cannot even assume that
removing a solution from the input interval graph yields an interval graph. This 
is in contrast to Theorem 4.4 where the graph resulting from removing a solution 
from the input proper interval graph is again a proper interval graph. 

4.4 Concluding Remarks


In this chapter, we studied LENGTH-BOUNDED CUT in the special case where the 
input graph is a proper interval graph and showed polynomial-time solvability. 
This confirms a conjecture by Bazgan et al. [Baz+19]. A natural next step is 
to investigate interval graphs. We showed some limitations for adapting our 
approach from proper interval graphs to interval graphs. We still conjecture that 
LENGTH-BOUNDED CUT on interval graphs should allow for a polynomial-time 
algorithm. 

Bazgan et al. [Baz+19] provide a hierarchy of parameters with known results 
and open problems for LENGTH-BOUNDED CUT. In the paper on which this 
chapter is based, we solved some of their open problems [BHK20]. Tackling the 
remaining ones is left as a challenge for future research. 

Chapter 5


Disjoint Shortest Paths 


This is the final chapter in the dynamic-programming part of the thesis. With
regard to content, our main contribution in this chapter is an XP-algorithm for
the NP-hard k-DISJOINT SHORTEST PATHS problem. This is a variant of the 
fundamental and well-studied combinatorial problem k-DISJOINT PATH. On a 
conceptual level, we will complete our journey through the intricacies of dynamic 
programming. For the dynamic program we develop in this chapter, there is a 
very simple way of computing each table entry. However, when we consider a 
natural generalization of our problem, then this way is not feasible any more. 
Further investigating the generalization yields another way of computing each 
table entry that will turn out to be much more efficient even for the special case 
we are mainly interested in. 

k-DISJOINT PATH describes the question of whether there are k pairwise
disjoint¹ paths between vertex terminal pairs $(s_i, t_i)_{i \in [k]}$ in a given
undirected graph G. Karp [Kar75] showed that the problem is NP-hard when k
is part of the input. On the positive side, Robertson and Seymour [RS95]
provided an algorithm running in $O(n^3)$ time for any constant k. Later,
Kawarabayashi et al. [KKR12] improved the running time to $O(n^2)$, again
for fixed k. On directed graphs, in contrast, the problem is NP-hard even
for $k = 2$ [FHW80]. However, on directed acyclic graphs (DAGs), the problem
becomes again polynomial-time solvable for constant k [FHW80].

We study a variant called k-DISJOINT SHORTEST PATHS. Therein, all paths 
in a sought solution have to be shortest paths between the respective terminal 
pairs. This problem has applications in transportation networks, circuit layout, 
and circuit routing (see e. g. the work by Kawarabayashi et al. [KKR12] and 


¹Here and in the following, this means vertex-disjoint.
references therein) and its complexity for constant k has been a long-standing
open problem [Eil98, Fom+19]. Very recently, Lochet [Loc21] settled this
question by showing that k-DISJOINT SHORTEST PATHS can be solved in $n^{O(k^{5^k})}$
time, that is, polynomial time for every constant k. We provide a new approach
with a novel geometric perspective that simplifies many arguments and leads
to an overall streamlined algorithm with a running time of $O(k \cdot n^{16k \cdot k! + k + 1})$.
Notably, k-DISJOINT SHORTEST PATHS is W[1]-hard with respect to k and,
assuming the ETH, there is no $f(k) \cdot n^{o(k)}$-time algorithm for k-DISJOINT
SHORTEST PATHS [Ben+21]. The asymptotic gap between this lower bound
and our upper bound of $n^{O(k \cdot k!)}$ is, however, still quite large.

We formalize our novel geometric view for k-DISJOINT SHORTEST PATHS and 
provide some structural observations regarding solutions to k-DISJOINT SHORT- 
EST PATHS in Section 5.2. In Section 5.3, we present a dynamic-programming- 
based approach to solve a special case of k-DISJOINT SHORTEST PATHS. After- 
wards, we provide our algorithm for the general problem that uses the dynamic 
program as a subprocedure and prove our main theorem. 


5.1 Problem Definition and Related Work 


k-DISJOINT SHORTEST PATHS is defined as follows. 


k-DISJOINT SHORTEST PATHS 

Input: An undirected graph $G = (V, E)$ and k pairs $(s_i, t_i)_{i \in [k]}$ of vertices.

Question: Are there k disjoint paths $P_i$ such that, for each $i \in [k]$, $P_i$ is a
shortest $s_i$-$t_i$-path?


Eilam-Tzoreff [Eil98] introduced this variant of k-DISJOINT PATH, showed that
it is NP-hard when k is part of the input, and provided a dynamic-programming-
based $O(n^8)$-time algorithm for 2-DISJOINT SHORTEST PATHS. This was later
improved to an O(n?m)-time algorithm [Ben+21]. The $O(n^8)$-time algorithm
for 2-DISJOINT SHORTEST PATHS works for positive edge lengths and, recently,
Gottschau et al. [GKW19] and Kobayashi and Sako [KS19] independently
extended this result by providing polynomial-time algorithms for the case where
the edge lengths are non-negative. Concerning directed graphs, Bérczi and
Kobayashi [BK17] provided a polynomial-time algorithm for positive edge
lengths for 2-DISJOINT SHORTEST PATHS. Note that setting all edge lengths to
zero results in 2-DISJOINT PATH on directed graphs, which is NP-hard [FHW80].
Extending the problem to finding two disjoint $s_i$-$t_i$-paths of minimal total length
(in undirected graphs), Björklund and Husfeldt [BH19] provided an $O(n^{11})$-time
randomized algorithm. Finally, Tragoudas and Varol [TV96] showed that it is
NP-hard to decide whether the number of solutions of an instance of 2-DISJOINT
PATHS is at least some given threshold.


5.2 A Geometric View on Shortest Paths 


In this section, we delineate our geometric perspective on k-DISJOINT SHORTEST 
PATHS, make some structural observations, and give a characterization of 
solutions with regard to their geometry. In the following sections, we will 
then use these observations to design a dynamic-programming-based algorithm 
for k-DISJOINT SHORTEST PATHS. We start with some basic intuition and 
a small example. In Subsection 5.2.1 we then formalize the geometry-based 
ideas and provide a characterization of solutions for 2-DISJOINT SHORTEST 
PATHS. In Subsection 5.2.2 we then generalize this characterization to solutions 
of k-DISJOINT SHORTEST PATHS. 

For the geometric representation, we define a k-dimensional vector² $\vec{v}$ for
each vertex v. The $i$-th coordinate of this vector is the distance between $s_i$
and v. An example of this vector representation is given in Figure 5.1. Note
that there can be multiple vertices with the same vector. The geometric
perspective is based on the following two observations. First, since each
path $P_i = (v^i_0, v^i_1, \dots, v^i_{d_i})$ in the sought solution is a shortest $s_i$-$t_i$-path, it
holds for each $j \in [d_i]$ that $\mathrm{dist}(s_i, v^i_j) = \mathrm{dist}(s_i, v^i_{j-1}) + 1$, that is, the path $P_i$
is strictly monotone in the $i$-th coordinate. We say that paths which are
strictly monotone in the $i$-th coordinate have color i. Second, for each vertex $v^i_j$
in $P_i$, it holds that $\mathrm{dist}(s_i, t_i) = \mathrm{dist}(s_i, v^i_j) + \mathrm{dist}(v^i_j, t_i)$. Thus, any vertex w
with $\mathrm{dist}(s_i, t_i) \ne \mathrm{dist}(s_i, w) + \mathrm{dist}(w, t_i)$ cannot be part of a shortest $s_i$-$t_i$-path.
We can formulate this into a necessary (but not sufficient) condition in terms of
our geometric perspective as follows. Consider the k-dimensional hyperrectangle
that has the vectors of $s_i$ and $t_i$ as two corners and whose sides form an angle
of 45° with the coordinate axes. We say that this hyperrectangle is spanned
by $s_i$ and $t_i$. The (hyper)rectangle spanned by $s_1$ and $t_1$ in the right-hand side
of Figure 5.1 is highlighted in gray. We will prove that any vertex whose vector
is not within the area of this hyperrectangle cannot be part of a shortest $s_i$-$t_i$-
path. Moreover, if we consider any vertex $v^i_j$ in $P_i$ and the two hyperrectangles


²We use the term vector interchangeably with the point the vector is pointing to.

Figure 5.1: Left side: A simple, undirected graph with four distinguished
vertices $s_1$, $s_2$, $t_1$, and $t_2$. The vectors with the distances to each $s_i$ are written next to
the vertices. Two disjoint shortest paths are highlighted.

Right side: A two-dimensional coordinate system. Each vertex is represented at its 
vector and edges are drawn as lines between their respective end points. When multiple 
vertices share the same vector, then vertices are depicted close to their actual vector. 
The rectangle spanned by $s_1$ and $t_1$ is drawn in gray. The two disjoint shortest paths
are again depicted. Note that the $s_1$-$t_1$-path is strictly monotone going to the right
and the $s_2$-$t_2$-path is strictly monotone going down.


spanned by $v^i_j$ and either $s_i$ or $t_i$, then it holds that the vector of each vertex
in $P_i$ is contained in the area of these two hyperrectangles.

We use these two observations as follows. Assume that there is a solution (a
set of pairwise disjoint shortest $s_i$-$t_i$-paths $(P_i)_{i \in [k]}$). We will show that each
path $P_i$ can be split into $\ell_i$ subpaths $P_i^1, P_i^2, \dots, P_i^{\ell_i}$ such that

1. $\ell_i \le f(k)$ for all $i \in [k]$ and for some computable function f,

2. the last vertex in $P_i^j$ is the first vertex in $P_i^{j+1}$ for each $i \in [k]$ and
   each $j \in [\ell_i - 1]$, and

3. the subpaths can be partitioned such that

   • subpaths in the same part of the partition share a common color and

   • for two subpaths $P_i^j$ and $P_{i'}^{j'}$ in different parts of the partition, it holds
     that the areas of the hyperrectangles spanned by the end vertices
     of $P_i^j$ and $P_{i'}^{j'}$, respectively, are disjoint.

Our algorithm works in two phases. In the first phase, we guess³ the end
vertices of each of the described subpaths (we call the end vertices marbles). In
the second phase, we compute the described partition and solve k-DISJOINT
SHORTEST PATHS independently for each part of the partition. For each part of
the partition, there is a color c that all subpaths in this part have. We assume
that each subpath in this part is strictly increasing in its c-coordinates as we
can otherwise swap the two endpoints. We then ignore all edges that are not
monotone in c (the two endpoints of the edge have the same c-coordinate) and
direct the edges so that they are pointing towards the higher c-coordinate. Note
that the resulting graph is a DAG and since each subpath is strictly increasing
in its c-coordinates, the directed version of each subpath is still contained in the
constructed DAG. Hence, we can use the algorithm of Fortune et al. [FHW80]
for k-DISJOINT SHORTEST PATHS on DAGs to find pairwise disjoint shortest
subpaths. We then also present our own dynamic program for k-DISJOINT
SHORTEST PATHS on DAGs and, as Fortune et al. [FHW80] only state $n^{O(k)}$
time, provide a precise running-time analysis.
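
The orientation step for a single part of the partition can be sketched in a few lines. This is a hedged illustration only: the function name `orient_by_color`, the edge-list representation, and the dictionary `vec` (mapping each vertex to its distance vector, with 0-based coordinate positions) are assumptions introduced for the example.

```python
def orient_by_color(edges, vec, c):
    """Orient undirected edges towards the higher c-coordinate.

    edges : iterable of pairs (u, v)
    vec   : dict vertex -> tuple of distances (position c holds the c-coordinate)
    Edges whose endpoints share the same c-coordinate are dropped, so every
    arc strictly increases the c-coordinate and the result is acyclic.
    """
    arcs = []
    for u, v in edges:
        if vec[u][c] < vec[v][c]:
            arcs.append((u, v))
        elif vec[v][c] < vec[u][c]:
            arcs.append((v, u))
        # equal c-coordinates: the edge is not monotone in c and is ignored
    return arcs
```

Acyclicity follows because the c-coordinate strictly increases along every arc, so no directed walk can return to its start.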

We remark that Lochet [Loc21] used the same two-step approach but how 
these steps are achieved is different. In particular, he does not use a geometric 
view on shortest paths (as we do). As a result, even for k = 2 he can only 
upper-bound the number of vertices his algorithm has to guess to ensure that no 
two parts can intersect by 9°! ([Loc21, Lemma 13]) while our approach produces 
at most five parts. Moreover, our geometric view allows us to use a more efficient 
way of splitting the paths for general k (in O((k + 1)!) parts instead of O(k®  ) 
as done by Lochet). 

We continue with some intuition for the described subpaths and the partition 
of them. We start with the two-dimensional case and distinguish between 
four cases. Figure 5.2 gives an overview over these cases and the vertices (the 
marbles) we guess in each case. It is easy to see that only in the case in the 
top right-hand corner the areas of two subpaths intersect (the dashed line). 
However, in this area both paths are strictly monotone in both coordinates. 
Thus, the depicted marbles ensure in each case that a partition as described 
exists. 

We continue with the case where k > 2. The basic idea is to recursively 
partition the paths by considering two-dimensional projections of the respective 
hyperrectangles. Note that if these projections are disjoint, then also the areas 


³Whenever we pretend to guess something, we mean that we iterate over all possible choices
and consider for the explanation or proof the respective correct iteration. 

Figure 5.2: The four cases for the two-dimensional projection of two paths $P_1$ and $P_2$.
The thick black lines represent $P_1$ and $P_2$ and the colored rectangles are the ones
spanned by the respective terminals and marbles. For easier distinction, we colored
everything related to the $s_1$-$t_1$-path red and related to the $s_2$-$t_2$-path blue (except
for the respective paths themselves). A black square represents a vector on which
we guessed a marble on both paths. The dashed line represents the subpaths of $P_1$
and $P_2$ that have a common color.

(top left-hand corner): The lines cross in one point with non-integer coordinates. 
(top right-hand corner): The lines cross in at least one point with integer coordinates. 
(bottom left-hand corner): The rectangles defined by s; and t; intersect (in the gray 
(darker) area), but the lines do not. 

(bottom right-hand corner): The rectangles defined by s; and t; do not intersect. 


of the respective hyperrectangles are disjoint. For each pair $(P_i, P_j)$ of paths,
we start with the orthogonal projection to the coordinates i and j. This yields
a set of marbles for $P_i$ and $P_j$ such that the respective subpaths either have
colors i and j or cannot interfere with the respective other path. Assume that we
guessed for each two-dimensional (i, j)-projection the intersection of $P_i$ and $P_j$
in this projection. Unfortunately, we cannot partition the respective subpaths as
stated above. Instead, we store for each subpath $P_i^j$ of $P_i$ the set of all colors
that $P_i^j$ has and recursively refine these subpaths until a partition as stated is
possible. Roughly speaking, we check for each pair of subpaths whether the 
areas of their respective hyperrectangles intersect and if they do, then we find 
a two-dimensional projection and use this to find new marbles. The resulting 
subpaths are then either disjoint from the respective other path or have an 
additional color. We continue this procedure until the areas of any two subpaths 
with different colors are disjoint. These subpaths are then partitioned by their 
respective sets of colors. Note that by construction different subpaths in one 
part of the partition share a common color and the areas of the hyperrectangles 
spanned by the end vertices of two subpaths in different parts of the partition 
are disjoint. 

In Subsection 5.2.1 we formalize the geometry-based ideas and provide a char- 
acterization of solutions for 2-DISJOINT SHORTEST PATHS. In Subsection 5.2.2 
we generalize this characterization to solutions of k-DISJOINT SHORTEST PATHS. 


5.2.1 Two Shortest Paths 


We now formalize and generalize the idea behind the geometric view (visualized
in Figures 5.1 and 5.2). We start with some notation for projections. For
any $\emptyset \subset I \subseteq [k]$ and any vector $x \in \mathbb{R}^k$, we denote with $x^I \in \mathbb{R}^{|I|}$ the orthogonal
projection of x to the coordinates in I. That is, $x^I$ is the $|I|$-dimensional
vector obtained by deleting all dimensions in x that are not in I. We usually
drop the brackets in the exponent, thus writing e.g. $(5, 6, 7, 8, 9)^{1,3,4} = (5, 7, 8)$
or $(5, 6, 7)^2 := (6)$. Similarly, for $R \subseteq \mathbb{R}^k$ we define $R^I := \{x^I \mid x \in R\} \subseteq \mathbb{R}^{|I|}$.

We associate with each vertex $v \in V$ a vector in the k-dimensional Euclidean
vector space. Formally, $\vec{v} := (\vec{v}^{\,i})_{i \in [k]} := (\mathrm{dist}(s_i, v))_{i \in [k]} \in \mathbb{N}^k$ and for $U \subseteq V$
we use $\vec{U} := \{\vec{u} \mid u \in U\}$ to denote the set of all vectors of vertices in U. For a
given instance of k-DISJOINT SHORTEST PATHS, one can compute the vector
of each vertex in $O(km)$ time by performing a breadth-first-search from each
vertex $s_i$.
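
As a minimal sketch of this preprocessing, the following Python function runs one breadth-first search per source. The adjacency-dictionary input and the name `distance_vectors` are assumptions made for illustration; unreachable vertices receive `None` in the respective coordinate, which is a convention of this sketch.

```python
from collections import deque

def distance_vectors(adj, sources):
    """Return a dict vertex -> tuple (dist(s_1, v), ..., dist(s_k, v)).

    adj     : dict vertex -> iterable of neighbors (undirected graph)
    sources : list [s_1, ..., s_k]
    One BFS per source, hence O(k * (n + m)) time in total.
    """
    vec = {v: [None] * len(sources) for v in adj}
    for idx, s in enumerate(sources):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for v, d in dist.items():
            vec[v][idx] = d
    return {v: tuple(coords) for v, coords in vec.items()}
```

On a connected graph the extra $O(kn)$ term is dominated by $O(km)$, matching the bound stated above.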

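We use the following notations for any non-empty index set $I \subseteq [k]$ in order
to compare vectors of vertices v, w or sets V, W of vertices:

\[ v \sim^I w :\iff \forall i \in I\colon \vec{v}^{\,i} \sim \vec{w}^{\,i} \quad \text{for } \sim\, \in \{<, \le, =, \ge, >\}, \text{ and} \]

\[ V \sim^I W :\iff \{\vec{v}^{\,I} \mid v \in V\} \sim \{\vec{w}^{\,I} \mid w \in W\} \quad \text{for } \sim\, \in \{\subset, \subseteq, =, \supseteq, \supset\}. \]

We further write $x \in^I X$ if there is an $x' \in X$ with $x' =^I x$ and $x \notin^I X$
otherwise.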


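Lemma 5.1. For any pair of vertices $v, w \in V$, we have $\|\vec{v} - \vec{w}\|_\infty \le \mathrm{dist}(v, w)$.


Proof. Let P be a shortest v-w-path. Each edge $\{p, q\}$ in P fulfills $\|\vec{p} - \vec{q}\|_\infty \le 1$
as $|\mathrm{dist}(s_i, p) - \mathrm{dist}(s_i, q)| \le 1$ for each vertex $s_i$. Thus, by the triangle
inequality, $\|\vec{v} - \vec{w}\|_\infty$ is at most the number of edges in P, which is $\mathrm{dist}(v, w)$.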


For two vertices $u, w \in V$, let

\[ u \circ w := \{ v \in V \mid \mathrm{dist}(u, v) + \mathrm{dist}(v, w) = \mathrm{dist}(u, w) \} \]

be the set of all vertices that lie on a shortest u-w-path. Similarly, for
any $x, y \in \mathbb{R}^k$, let

\[ x \circ y := \{ z \in \mathbb{R}^k \mid \|x - z\|_\infty + \|z - y\|_\infty = \|x - y\|_\infty \} \]

be the hyperrectangle spanned by x and y (whose sides form an angle of 45°
with the coordinate axes (see Figure 5.1)). We continue with a formal definition
of colors.


Definition 5.1. Let s, t be two vertices and let P be a shortest s-t-path. The
pair (s, t) and the path P are colored if $\mathrm{dist}(s, t) = \|\vec{s} - \vec{t}\|_\infty$. Let

\[ C(P) := C(s, t) := \{ c \in [k] \mid |\vec{s}^{\,c} - \vec{t}^{\,c}| = \|\vec{s} - \vec{t}\|_\infty \} \]

be the set of all colors of P. The pair (s, t) and the path P are c-colored for
each $c \in C(s, t)$.
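
The spanned hyperrectangle and the color set translate directly into small predicates on integer vectors. The following Python sketch is an illustration only; the function names (`linf`, `in_span`, `colors`, `is_colored`) are hypothetical, colors are numbered from 1 to match $[k]$, and exact equality is used, so the inputs are assumed to be integer (or otherwise exactly representable) coordinates.

```python
def linf(x, y):
    """Maximum norm of x - y for two equally long coordinate tuples."""
    return max(abs(a - b) for a, b in zip(x, y))

def in_span(x, y, z):
    """True iff z lies in the hyperrectangle spanned by x and y,
    i.e. ||x - z|| + ||z - y|| = ||x - y|| in the maximum norm."""
    return linf(x, z) + linf(z, y) == linf(x, y)

def colors(s_vec, t_vec):
    """Set of coordinates c in {1, ..., k} with |s^c - t^c| = ||s - t||."""
    d = linf(s_vec, t_vec)
    return {c for c, (a, b) in enumerate(zip(s_vec, t_vec), start=1)
            if abs(a - b) == d}

def is_colored(dist_st, s_vec, t_vec):
    """True iff the pair is colored, given its graph distance dist_st."""
    return dist_st == linf(s_vec, t_vec)
```

For instance, `in_span((0, 0), (4, 2), (2, 1))` holds since $2 + 2 = 4$, whereas the point $(1, 3)$ lies outside the rectangle spanned by $(0, 0)$ and $(4, 2)$.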


Note that this definition of a c-colored path is equivalent to saying that P is
strictly monotone in its c-coordinates. Note further that for arbitrary $u, w \in V$
we do not always have $\overrightarrow{u \circ w} \subseteq \vec{u} \circ \vec{w}$, that is, the vectors of all vertices on a
shortest u-w-path are not necessarily contained in the set of vectors “spanned”
by $\vec{u}$ and $\vec{w}$. However, this inclusion holds for colored vertex pairs as shown
next.


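Lemma 5.2. Let $v, w \in V$ be a b-colored pair. Then, $\overrightarrow{v \circ w} \subseteq \vec{v} \circ \vec{w}$.


Proof. Without loss of generality $v <^b w$. Let u be an arbitrary vertex in $v \circ w$.
Then, $\mathrm{dist}(v, w) = \mathrm{dist}(v, u) + \mathrm{dist}(u, w)$. Definition 5.1 and Lemma 5.1 yield

\[ \vec{w}^{\,b} = \vec{v}^{\,b} + \mathrm{dist}(v, w) = \vec{v}^{\,b} + \mathrm{dist}(v, u) + \mathrm{dist}(u, w) \ge \vec{u}^{\,b} + \mathrm{dist}(u, w) \ge \vec{w}^{\,b}. \]

Hence, $\vec{u}^{\,b} = \vec{v}^{\,b} + \mathrm{dist}(v, u)$ and $\vec{w}^{\,b} = \vec{u}^{\,b} + \mathrm{dist}(u, w)$. Lemma 5.1 then states
that $\mathrm{dist}(v, u) = \|\vec{v} - \vec{u}\|_\infty$ and $\mathrm{dist}(u, w) = \|\vec{w} - \vec{u}\|_\infty$. Hence,

\[ \|\vec{v} - \vec{w}\|_\infty = \mathrm{dist}(v, w) = \mathrm{dist}(v, u) + \mathrm{dist}(u, w) = \|\vec{v} - \vec{u}\|_\infty + \|\vec{u} - \vec{w}\|_\infty. \]

This leads to $\vec{u} \in \vec{v} \circ \vec{w}$ and thus $\overrightarrow{v \circ w} \subseteq \vec{v} \circ \vec{w}$.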


Throughout this chapter, we will be particularly interested in two-dimensional
projections of areas $\vec{v} \circ \vec{w}$ for some vertices v and w. Note in this context
that $(\vec{v} \circ \vec{w})^I = \vec{v}^{\,I} \circ \vec{w}^{\,I}$. Recall that the area defined by $x \circ y$ for $x, y \in \mathbb{N}^2$
is a rectangle in the plane whose sides form an angle of 45° to the coordinate
axes. The following lemma lists necessary and sufficient conditions for those
rectangles to intersect.


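Lemma 5.3. Let $x, y, \hat{x}, \hat{y} \in \mathbb{N}^2$. Then $x \circ y \cap \hat{x} \circ \hat{y} \ne \emptyset$ if and only if all of the
following hold:

(i) $\min\{x^1 - x^2, y^1 - y^2\} \le \max\{\hat{x}^1 - \hat{x}^2, \hat{y}^1 - \hat{y}^2\}$,

(ii) $\min\{\hat{x}^1 - \hat{x}^2, \hat{y}^1 - \hat{y}^2\} \le \max\{x^1 - x^2, y^1 - y^2\}$,

(iii) $\min\{x^1 + x^2, y^1 + y^2\} \le \max\{\hat{x}^1 + \hat{x}^2, \hat{y}^1 + \hat{y}^2\}$, and

(iv) $\min\{\hat{x}^1 + \hat{x}^2, \hat{y}^1 + \hat{y}^2\} \le \max\{x^1 + x^2, y^1 + y^2\}$.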


Proof. Let $R_1, R_2 \subseteq \mathbb{R}^2$ be two axis-parallel rectangles defined by the opposite
corners $q, r \in \mathbb{R}^2$ and $\hat{q}, \hat{r} \in \mathbb{R}^2$. It is easy to see that $R_1$ and $R_2$ intersect if
and only if

a) $\min\{q^1, r^1\} \le \max\{\hat{q}^1, \hat{r}^1\}$ and $\min\{\hat{q}^1, \hat{r}^1\} \le \max\{q^1, r^1\}$
   (i.e., there is an overlap in the first coordinate), and

b) $\min\{q^2, r^2\} \le \max\{\hat{q}^2, \hat{r}^2\}$ and $\min\{\hat{q}^2, \hat{r}^2\} \le \max\{q^2, r^2\}$
   (i.e., there is an overlap in the second coordinate).

Since the intersection of two rectangles is invariant under rotation and scaling,
we simply rotate $x \circ y$ and $\hat{x} \circ \hat{y}$ by 45° (and scale it by factor $\sqrt{2}$) by multiplying
all vectors with the matrix

\[ R = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}. \]

Now the above characterization for axis-parallel rectangles translates into the
conditions stated in the lemma.
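
The four conditions are easy to check mechanically. The following Python sketch is an illustration under stated assumptions: the name `spans_intersect` is hypothetical, points are pairs of integers, and the helper `rot` applies the map $(p^1, p^2) \mapsto (p^1 - p^2, p^1 + p^2)$, that is, the rotation and scaling used in the proof.

```python
def spans_intersect(x, y, xh, yh):
    """Check conditions (i)-(iv) for the rectangles spanned by (x, y) and (xh, yh).

    All arguments are pairs of numbers; xh and yh play the role of the
    hatted points. After the 45-degree rotation both rectangles are
    axis-parallel, so it suffices to test interval overlap per coordinate.
    """
    def rot(p):
        return (p[0] - p[1], p[0] + p[1])

    (a1, a2), (b1, b2) = rot(x), rot(y)
    (c1, c2), (d1, d2) = rot(xh), rot(yh)
    return (min(a1, b1) <= max(c1, d1)        # (i)
            and min(c1, d1) <= max(a1, b1)    # (ii)
            and min(a2, b2) <= max(c2, d2)    # (iii)
            and min(c2, d2) <= max(a2, b2))   # (iv)
```

Note that a non-empty intersection may consist only of points with non-integer coordinates, so the predicate can return `True` even if no integer point lies in both rectangles.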




Figure 5.3: The rectangle $x \circ y$ spanned by two points x and y in two dimensions
and a point $z \in x \circ y$. Note that $\|y - x\|_\infty$ is the vertical distance between x and y.
Lemma 5.4 states that $d_1 \ge d_2$ and $d_3 \ge d_4$.


The next lemma states that the distance between x and any $z \in x \circ y$ where x
and y have distance $|y^c - x^c|$ is at most $|z^c - x^c|$. Intuitively, this is clear as $x \circ y$
is a hyperrectangle whose sides form an angle of 45° with the coordinate axes
and hence half of its borders exactly define all points z whose distance to x is
exactly $|z^c - x^c|$. See Figure 5.3 for an illustration.


Lemma 5.4. Let $b, c \in [k]$ and let $x, y \in \mathbb{N}^k$ with $\|y - x\|_\infty = y^c - x^c$. Then,
for all $z \in x \circ y$ it holds that $z^c - x^c \ge |z^b - x^b| \ge 0$ and $y^c - z^c \ge |y^b - z^b| \ge 0$.


Proof. By assumption and the definition of $x \circ y$, it holds that

\begin{align*}
y^c - x^c = \|y - x\|_\infty &= \|y - z\|_\infty + \|z - x\|_\infty \\
&\ge |y^c - z^c| + |z^c - x^c| \ge (y^c - z^c) + (z^c - x^c) \\
&= y^c - x^c.
\end{align*}

Thus, we have equality everywhere, in particular $y^c - z^c = \|y - z\|_\infty \ge |y^b - z^b|$
and $z^c - x^c = \|z - x\|_\infty \ge |z^b - x^b|$ (as shown by the equality between the last
term in the first row and the first term in the second row).


We next formalize the lines we used in Figure 5.1 to connect the vectors
of vertices in a path. To this end, for any path $P = (v_1, v_2, \dots, v_i)$ we
define $\mathfrak{c}(P) \subseteq \mathbb{R}^k$ as the piecewise linear curve connecting the points of $\vec{P}$ in the
order given by P. Recall that $C(P)$ denotes the set of all colors a such that P
is a-colored. The next observation states that $\mathfrak{c}(P)^{C(P)}$ is a straight line, which
is equivalent to the statement that P is strictly monotone in each coordinate
in $C(P)$.


Observation 5.5. Let P be a colored path. Then $\mathfrak{c}(P)^{C(P)}$ is a straight line
segment.


Proof. Let $\ell := \|\vec{s}_P - \vec{t}_P\|_\infty$ and $k' := |C(P)|$. The path P contains exactly $\ell$
edges, each of which has a Euclidean length of at most $\sqrt{k'}$ in the
projection $\mathfrak{c}(P)^{C(P)}$. Thus the length of $\mathfrak{c}(P)^{C(P)}$ is at most $\ell \cdot \sqrt{k'}$, which is exactly
the Euclidean distance between $\vec{s}_P^{\,C(P)}$ and $\vec{t}_P^{\,C(P)}$.

As a consequence of Observation 5.5, the intersection of two paths P,Q in the 
(C(P) U C(Q))-projection is also a straight line segment with an angle of 45° 
to the coordinate axes as shown in Figure 5.1 (right side) and Figure 5.2 (top 
right). 


Lemma 5.6. Let $P$ and $Q$ be two colored paths, and $C \subseteq C(P) \cup C(Q)$.
Then $\mathcal{C}(P)^C \cap \mathcal{C}(Q)^C$ is a (possibly empty) straight line segment.


Proof. For the sake of notation, we assume that $C = [|C|]$. Note that $\mathcal{C}(P)$
and $\mathcal{C}(Q)$ are piecewise linear curves. Moreover, according to Lemma 5.2 for
any two points $x, y \in \mathcal{C}(P)$, it holds that $\|x - y\|_\infty = |x^c - y^c|$ for all $c \in C(P)$
and $\|x' - y'\|_\infty = |x'^b - y'^b|$ for any two points $x', y' \in \mathcal{C}(Q)$ and all $b \in C(Q)$.
So, for $x, y \in \mathbb{R}^k$ with $x^C, y^C \in \mathcal{C}(P)^C \cap \mathcal{C}(Q)^C$, it follows that

$$|x^c - y^c| = \|x - y\|_\infty = |x^b - y^b|$$

for all $c \in C \cap C(P)$ and all $b \in C \cap C(Q)$. Thus, $C(x, y) \supseteq C$ and the claim
follows from Observation 5.5.


Note that even if $\mathcal{C}(P)^{C(P) \cup C(Q)} \cap \mathcal{C}(Q)^{C(P) \cup C(Q)}$ is non-empty, then it does
not need to contain points from $\mathbb{N}^{|C(P) \cup C(Q)|}$ as can be seen in the top left
example in Figure 5.2.

The following definition starts to formalize the notion of marbles, that is, the 
special vertices in the different cases in Figure 5.2. We start with the two cases 
in which the lines of P and Q cross (the upper two). 
Definition 5.2. Let $P, Q$ be two colored paths, let $b \in C(P)$, and let $c \in C(Q)$.
The paths $P$ and $Q$ are $b,c$-crossing if the intersection

$$X := \mathcal{C}(P)^{b,c} \cap \mathcal{C}(Q)^{b,c}$$

is non-empty and they are $b,c$-non-crossing otherwise.
If $X \neq \emptyset$, then we define $\alpha_P^{b,c}(Q)$ and $\omega_P^{b,c}(Q)$ to be the first and last
vertex $v$ of $P$ with $\vec{v}^{\,b,c} \in X$. If $X$ only contains non-integer coordinates,
then $\alpha_P^{b,c}(Q) := \omega_P^{b,c}(Q) := \bot$. We further define $\partial_P^{b,c}(Q)$ and $\varpi_P^{b,c}(Q)$ to be the
last vertex before and the first vertex after that intersection. If $X = \emptyset$,
then $\alpha_P^{b,c}(Q) := \omega_P^{b,c}(Q) := \partial_P^{b,c}(Q) := \varpi_P^{b,c}(Q) := \bot$.


Regarding notation, we will use $\alpha_P$ instead of $\alpha_P^{b,c}(Q)$ (and the same for $\omega$, $\partial$,
and $\varpi$) if $b$, $c$, and $Q$ are clear from the context. Note that the subpaths between
the respective $\alpha$- and $\omega$-vertices are by Lemma 5.6 straight lines and $\{b, c\}$-colored.


Observation 5.7. If $P, Q$ are two paths with $\alpha_P^{b,c}(Q) \neq \bot$, then
$$P[\alpha_P^{b,c}(Q), \omega_P^{b,c}(Q)] =^{b,c} Q[\alpha_Q^{b,c}(P), \omega_Q^{b,c}(P)].$$
In particular, both of these subpaths are $b,c$-colored.


It remains to consider the subpaths between $s$- and $\partial$-vertices and between $\varpi$-
and $t$-vertices. By Lemma 5.2 these have to lie in the rectangle areas

$$\vec{s_P} \circ \overrightarrow{\partial_P^{b,c}(Q)},\quad \overrightarrow{\varpi_P^{b,c}(Q)} \circ \vec{t_P},\quad \vec{s_Q} \circ \overrightarrow{\partial_Q^{b,c}(P)},\quad \text{and}\quad \overrightarrow{\varpi_Q^{b,c}(P)} \circ \vec{t_Q}. \tag{5.1}$$

Figure 5.2 (top left-hand corner and top right-hand corner) suggests that
these areas are pairwise disjoint. We will show that this is indeed the case
and, to this end, we show the following two observations. The first one states
that the $\partial$-vertex on $P$ has a $b$-coordinate that is at most the $b$-coordinate of
the $\partial$-vertex on $Q$, where $b$ is the “original” color of $P$. Note that this $\partial$-vertex
is right before the respective $\alpha$-vertex or before the single crossing point with
non-integer coordinates. Since $P$ is strictly $b$-monotone, the path $Q$ can at most
increase or decrease as fast as $P$ from the point of intersection.


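Observation 5.8. Let $P, Q$ be two $b,c$-crossing paths with $\partial_P^{b,c}(Q) \neq \bot$. If $P$
is a subpath of a shortest $s_b$-$t_b$-path and $Q$ is a subpath of a shortest $s_c$-$t_c$-path,
then $\overrightarrow{\partial_P(Q)}^{\,b} \le \overrightarrow{\partial_Q(P)}^{\,b}$ and $\overrightarrow{\partial_Q(P)}^{\,c} \le \overrightarrow{\partial_P(Q)}^{\,c}$.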
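Proof. Let $z \in \mathcal{C}(P)^{b,c} \cap \mathcal{C}(Q)^{b,c}$ have minimal $b$-coordinate. Note that

$$\left\| z - \overrightarrow{\partial_P(Q)}^{\,b,c} \right\|_\infty = \left\| z - \overrightarrow{\partial_Q(P)}^{\,b,c} \right\|_\infty$$

and since $\mathcal{C}(P)$ is strictly increasing in its $b$-coordinate, we can infer that

$$z^b - \overrightarrow{\partial_P(Q)}^{\,b} = \left\| z - \overrightarrow{\partial_P(Q)}^{\,b,c} \right\|_\infty = \left\| z - \overrightarrow{\partial_Q(P)}^{\,b,c} \right\|_\infty \ge z^b - \overrightarrow{\partial_Q(P)}^{\,b}.$$

This yields $\overrightarrow{\partial_P(Q)}^{\,b} \le \overrightarrow{\partial_Q(P)}^{\,b}$ and the second inequality follows analogously.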


The second observation is a simple but useful restatement of Lemma 5.1. 


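Observation 5.9. Let $b, c \in [k]$ and let $(v, w)$ be a $b$-colored pair of vertices
with $v <^b w$. Then $\vec{w}^{\,b} - \vec{w}^{\,c} \ge \vec{v}^{\,b} - \vec{v}^{\,c}$.


Proof. By Lemma 5.1, $\vec{w}^{\,c} - \vec{v}^{\,c} \le \operatorname{dist}(v, w) = \vec{w}^{\,b} - \vec{v}^{\,b}$. A simple arithmetic
reformulation yields $\vec{w}^{\,c} - \vec{w}^{\,b} \le \vec{v}^{\,c} - \vec{v}^{\,b}$ and multiplying both sides with $-1$
completes the proof.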


We are now in the position to prove the statement that the four areas defined 
in Term (5.1) are pairwise disjoint. 


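Lemma 5.10. Let $P$ and $Q$ be two $b,c$-crossing paths. The sets

$$(\vec{s_P} \circ \vec{\partial_P})^{b,c},\quad (\vec{\varpi_P} \circ \vec{t_P})^{b,c},\quad (\vec{s_Q} \circ \vec{\partial_Q})^{b,c},\quad \text{and}\quad (\vec{\varpi_Q} \circ \vec{t_Q})^{b,c}$$

are pairwise disjoint (or undefined).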


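Proof. Without loss of generality, let $P$ be $b$-colored, $Q$ be $c$-colored, $s_P <^b t_P$,
and $s_Q <^c t_Q$. Recall that $s_P$ and $t_P$ are the start and end vertices of $P$,
respectively. We further assume that all above sets are defined, that is, none of
the described end points is $\bot$. By Lemma 5.4, for any $x \in (\vec{s_P} \circ \vec{\partial_P})^{b,c}$ and $y \in (\vec{\varpi_P} \circ \vec{t_P})^{b,c}$
it holds that $x^b \le \vec{\partial_P}^{\,b} < \vec{\varpi_P}^{\,b} \le y^b$, and thus $(\vec{s_P} \circ \vec{\partial_P})^{b,c} \cap (\vec{\varpi_P} \circ \vec{t_P})^{b,c} = \emptyset$.
An analogous argument holds for $(\vec{s_Q} \circ \vec{\partial_Q})^{b,c}$ and $(\vec{\varpi_Q} \circ \vec{t_Q})^{b,c}$.

We will now use Lemma 5.3 to show that $(\vec{s_P} \circ \vec{\partial_P})^{b,c} \cap (\vec{s_Q} \circ \vec{\partial_Q})^{b,c} = \emptyset$.
Since all other remaining cases are analogous, this will conclude the proof. By
Observation 5.9, it holds that

$$\max\{\vec{s_P}^{\,b} - \vec{s_P}^{\,c},\ \vec{\partial_P}^{\,b} - \vec{\partial_P}^{\,c}\} = \vec{\partial_P}^{\,b} - \vec{\partial_P}^{\,c} \quad\text{and}$$
$$\min\{\vec{s_Q}^{\,b} - \vec{s_Q}^{\,c},\ \vec{\partial_Q}^{\,b} - \vec{\partial_Q}^{\,c}\} = \vec{\partial_Q}^{\,b} - \vec{\partial_Q}^{\,c}.$$

Observe that $\vec{\partial_P}^{\,b,c} \neq \vec{\partial_Q}^{\,b,c}$ as otherwise $\partial_P$ would lie on the intersection of $P$
and $Q$, a contradiction. Hence $\vec{\partial_P}^{\,b} \neq \vec{\partial_Q}^{\,b}$ or $\vec{\partial_P}^{\,c} \neq \vec{\partial_Q}^{\,c}$. In the former case,
Observation 5.8 states that $\partial_Q >^b \partial_P$ and in the latter case it states $\partial_P >^c \partial_Q$.
Hence, $\vec{\partial_P}^{\,c} + \vec{\partial_Q}^{\,b} > \vec{\partial_P}^{\,b} + \vec{\partial_Q}^{\,c}$ and thus $\vec{\partial_P}^{\,b} - \vec{\partial_P}^{\,c} < \vec{\partial_Q}^{\,b} - \vec{\partial_Q}^{\,c}$.
Setting $x := \vec{s_P}^{\,b,c}$, $y := \vec{\partial_P}^{\,b,c}$, $\hat{x} := \vec{s_Q}^{\,b,c}$, and $\hat{y} := \vec{\partial_Q}^{\,b,c}$ in Lemma 5.3 then
violates condition (ii) and hence $(\vec{s_P} \circ \vec{\partial_P})^{b,c} \cap (\vec{s_Q} \circ \vec{\partial_Q})^{b,c} = \emptyset$.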


We continue with the definition of marbles for the cases where the two paths $P$
and $Q$ are $b,c$-non-crossing. Figure 5.2 shows that even in this case $\vec{s_P} \circ \vec{t_P}$
and $\vec{s_Q} \circ \vec{t_Q}$ in general are not disjoint (bottom left-hand corner). Since the two
bottom cases are distinguished by the intersection of $\vec{s_P} \circ \vec{t_P}$ and $\vec{s_Q} \circ \vec{t_Q}$, we
start with a definition of this intersection.


Definition 5.3. Let $b, c \in [k]$. Let $P$ be a $b$-colored path and let $Q$ be a $c$-colored
path. The common $b,c$-area of $P$ and $Q$ is

$$A^{b,c}(P, Q) := (\vec{s_P} \circ \vec{t_P})^{b,c} \cap (\vec{s_Q} \circ \vec{t_Q})^{b,c}.$$


Note that if $A^{b,c}(P, Q) = \emptyset$, then by Lemma 5.2, $P$ and $Q$ do not share vertices
with common vectors and hence they do not share common vertices.
If $A^{b,c}(P, Q) \neq \emptyset$ and $P$ and $Q$ are $b,c$-crossing, then we can use Definition 5.2
to define the marbles. It hence remains to study the case where $A^{b,c}(P, Q) \neq \emptyset$
and $P$ and $Q$ are $b,c$-non-crossing. In this case we need at most one marble per
path and this marble is defined as follows.


Definition 5.4. Let $P$ be a $b$-colored path, $Q$ be a $c$-colored path, and let
without loss of generality be $s_Q <^c t_Q$. Define

$$B := \{v \in V \mid v =^b s_Q \wedge v <^c s_Q\} \cup \{v \in V \mid v =^b t_Q \wedge v >^c t_Q\}.$$

If $P \cap B \neq \emptyset$, then $\delta_P^{b,c}(Q)$ is the unique vertex in $P \cap B$. If $P \cap B = \emptyset$,
then $\delta_P^{b,c}(Q) := \bot$.
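To make Definition 5.4 concrete, the following sketch computes the $\delta$-marble from explicit vectors. It is illustrative only: the mapping `vec` from vertex names to coordinate tuples, the function name `delta_marble`, and the toy data are hypothetical, and `None` stands in for $\bot$.

```python
def delta_marble(path, vec, s_q, t_q, b, c):
    """Return the vertex of `path` that lies in the set B, or None.

    Assumes vec[s_q][c] < vec[t_q][c], as required by the definition.
    """
    hits = [
        v for v in path
        if (vec[v][b] == vec[s_q][b] and vec[v][c] < vec[s_q][c])
        or (vec[v][b] == vec[t_q][b] and vec[v][c] > vec[t_q][c])
    ]
    if len(hits) > 1:
        # For a b-colored path this should not occur (see Observation 5.11).
        raise ValueError("more than one vertex of the path lies in B")
    return hits[0] if hits else None


# Hypothetical toy data: coordinate 0 plays the role of b, coordinate 1 of c.
vec = {
    "sQ": (2, 0), "tQ": (3, 3),
    "p0": (1, 5), "p1": (2, 5), "p2": (3, 5), "p3": (4, 4),
    "r0": (5, 0), "r1": (6, 0),
}
marble = delta_marble(["p0", "p1", "p2", "p3"], vec, "sQ", "tQ", b=0, c=1)
no_marble = delta_marble(["r0", "r1"], vec, "sQ", "tQ", b=0, c=1)
```

In the toy data, `p2` shares its $b$-coordinate with `tQ` and lies above it in the $c$-coordinate, so it is the only vertex of the first path in $B$; the second path never reaches the $b$-coordinates of `sQ` or `tQ`.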


Observation 5.11. $\delta_P^{b,c}(Q)$ is well-defined.


Proof. Since $P$ is strictly monotone in the $b$-coordinate, it can clearly contain
at most one point from $B_1 := \{v \in V \mid v =^b s_Q \wedge v <^c s_Q\}$ and one
from $B_2 := \{v \in V \mid v =^b t_Q \wedge v >^c t_Q\}$. It remains to show that it cannot
intersect both sets. To this end, observe that Lemma 5.4 states for any $b_1 \in B_1$
and $b_2 \in B_2$ that $|\vec{b_1}^{\,c} - \vec{b_2}^{\,c}| > |\vec{s_Q}^{\,c} - \vec{t_Q}^{\,c}| \ge |\vec{s_Q}^{\,b} - \vec{t_Q}^{\,b}| = |\vec{b_1}^{\,b} - \vec{b_2}^{\,b}|$, and therefore the
pair $(b_1, b_2)$ is not $b$-colored and thus $P$ cannot contain both $b_1$ and $b_2$.


We next show that if $A^{b,c}(P, Q) \neq \emptyset$ and $P$ and $Q$ are $b,c$-non-crossing, then
at least one of the vertices $\delta_P^{b,c}(Q)$ or $\delta_Q^{b,c}(P)$ exists. Afterwards we will show
that these vertices guarantee that the respective new areas are disjoint.


Lemma 5.12. Let $P, Q$ be two $b,c$-non-crossing paths with $A^{b,c}(P, Q) \neq \emptyset$.
Then, $\delta_P^{b,c}(Q) \neq \bot$ or $\delta_Q^{b,c}(P) \neq \bot$.


Proof. Note that the definition of $A^{b,c}(P, Q)$ requires that without loss of
generality $P$ is $b$-colored and $Q$ is $c$-colored. We further assume without loss of
generality that $\{b, c\} = [2]$ and that $s_P <^b t_P$ and $s_Q <^c t_Q$.

Suppose towards a contradiction that $\delta_P^{b,c}(Q) = \delta_Q^{b,c}(P) = \bot$. Since $P, Q$
are $b,c$-non-crossing and $\delta_Q^{b,c}(P) = \bot$, the path $Q$ cannot cross the curve

$$\{x \in \mathbb{N}^2 \mid x^c = \vec{s_P}^{\,c} \wedge x^b < \vec{s_P}^{\,b}\} \cup \mathcal{C}(P)^{b,c} \cup \{x \in \mathbb{N}^2 \mid x^c = \vec{t_P}^{\,c} \wedge x^b > \vec{t_P}^{\,b}\}.$$

Since $Q$ cannot cross the line, it is located completely on one side of it. Assume
without loss of generality that $Q$ (and thus in particular $t_Q$) is located on the
side containing $(0, 0)$, that is, for all $v$ in $P$ and $w$ in $Q$ with $\vec{v}^{\,b} = \vec{w}^{\,b}$ it holds
that $\vec{v}^{\,c} > \vec{w}^{\,c}$.

Then there are three possible cases: $\vec{t_Q}^{\,b} < \vec{s_P}^{\,b}$, $\vec{t_Q}^{\,b} \in [\vec{s_P}^{\,b}, \vec{t_P}^{\,b}]$, or $\vec{t_Q}^{\,b} > \vec{t_P}^{\,b}$. Note
that in the first case by Lemma 5.4 it holds for any $z \in A^{b,c}(P, Q)$ that


z- to > 2° — th > 2? — sh > 2° — s$, 


a contradiction to $z \in (\vec{s_Q} \circ \vec{t_Q})^{b,c}$. In the last case, a similar argument holds
with


sb <th- 2° <ta- 2° < sb - 2°, 
which is a contradiction to $z \in (\vec{s_P} \circ \vec{t_P})^{b,c}$. It remains to analyze the case
where $\vec{t_Q}^{\,b} \in [\vec{s_P}^{\,b}, \vec{t_P}^{\,b}]$. Note that in this case there is a vertex $p$ in $P$ with $\vec{p}^{\,b} = \vec{t_Q}^{\,b}$.
Hence $\vec{t_Q}^{\,c} < \vec{p}^{\,c}$, and $p \in \{v \in V \mid v =^b t_Q \wedge v >^c t_Q\}$. Thus $\delta_P^{b,c}(Q) = p \neq \bot$, a
contradiction.


The next lemma shows that if $P$ and $Q$ are $b,c$-non-crossing, but they have
a common area $A^{b,c}(P, Q)$ and $\delta_P^{b,c}(Q) \neq \bot$, then the area between $s_Q$ and $t_Q$ is
disjoint from the two areas between $s_P$ and $\delta_P^{b,c}(Q)$ and between $\delta_P^{b,c}(Q)$ and $t_P$.
Hence $\delta_P^{b,c}(Q)$ is the last type of marble needed.


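Lemma 5.13. Let $P$ be a $b$-colored path and let $Q$ be a $c$-colored path such
that $A^{b,c}(P, Q) \neq \emptyset$ and $\delta_P^{b,c}(Q) \neq \bot$. Then,

$$(\vec{s_Q} \circ \vec{t_Q})^{b,c} \text{ is disjoint from } (\vec{s_P} \circ \overrightarrow{\delta_P^{b,c}(Q)})^{b,c} \cup (\overrightarrow{\delta_P^{b,c}(Q)} \circ \vec{t_P})^{b,c}.$$

Proof. For the sake of readability, we use $\delta := \delta_P^{b,c}(Q)$. We will show that

$$(\vec{s_Q} \circ \vec{t_Q})^{b,c} \cap (\vec{s_P} \circ \vec{\delta})^{b,c} = \emptyset.$$

The proof for $(\vec{\delta} \circ \vec{t_P})^{b,c}$ is then completely analogous. Assume without loss of
generality that $\delta =^b t_Q$ and $\delta >^c t_Q$. Notice that

$$\max\{\vec{s_P}^{\,b} - \vec{s_P}^{\,c},\ \vec{\delta}^{\,b} - \vec{\delta}^{\,c}\} = \vec{\delta}^{\,b} - \vec{\delta}^{\,c} < \vec{t_Q}^{\,b} - \vec{t_Q}^{\,c} = \min\{\vec{s_Q}^{\,b} - \vec{s_Q}^{\,c},\ \vec{t_Q}^{\,b} - \vec{t_Q}^{\,c}\}.$$

Setting $x := \vec{s_P}^{\,b,c}$, $y := \vec{\delta}^{\,b,c}$, $\hat{x} := \vec{s_Q}^{\,b,c}$, and $\hat{y} := \vec{t_Q}^{\,b,c}$ in Lemma 5.3 then
yields that condition (ii) is violated and thus $(\vec{s_P} \circ \vec{\delta})^{b,c} \cap (\vec{s_Q} \circ \vec{t_Q})^{b,c} = \emptyset$.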


We are finally in the position to define the set of marbles for a pair of paths. 
Afterwards we conclude this subsection with the main proposition that states 
that marbles uniquely classify solutions. 


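Definition 5.5. Let $P$ be a $b$-colored $s_P$-$t_P$-path and $Q$ be a $c$-colored $s_Q$-$t_Q$-path.
The set of $\{b, c\}$-marbles of $P$ with respect to $Q$ is

$$M_P^{b,c}(Q) := \{s_P, t_P, \mu_P^{b,c}(Q) \mid \mu \in \{\alpha, \omega, \partial, \varpi, \delta\}\} \setminus \{\bot\}.$$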
The next proposition states that if there are two different solutions and in
particular two pairs (P,Q) and (P’, Q’) of solution paths with the same marbles, 
then P and Q share exactly the same vectors as P’ and Q’ do. Recall that the 
shared vectors are a straight line segment and that once their ends are fixed, we 
can use the dynamic program by Fortune et al. [FHW80] to find disjoint paths 
between these ends. 


Proposition 5.14. Let $P$ and $P'$ be $b$-colored $s_P$-$t_P$-paths, and let $Q$ and $Q'$
be $c$-colored $s_Q$-$t_Q$-paths. If $M_P^{b,c}(Q) \subseteq P'$ and $M_Q^{b,c}(P) \subseteq Q'$, then

$$\{v \in P' \mid v \in^{b,c} Q'\} =^{b,c} \{v \in P \mid v \in^{b,c} Q\}.$$


Proof. Let $R$ be the subpath of $P$ that starts at $\alpha_P^{b,c}(Q)$ and ends at $\omega_P^{b,c}(Q)$
(or $R = \emptyset$ if $\alpha_P = \omega_P = \bot$). From the definition of $\alpha$ and $\omega$ and Lemma 5.6, it
follows that $\{v \in P \mid v \in^{b,c} Q\} =^{b,c} R$. We now consider the two cases whether
or not $P$ and $Q$ are $b,c$-crossing.


If $P$ and $Q$ are $b,c$-crossing, then by definition $\partial_P \neq \bot$ and $\varpi_P \neq \bot$. It follows
from Lemma 5.2 that the subpaths of $P'$ from $s_P$ to $\partial_P$ and from $\varpi_P$ to $t_P$ use
only vectors from $\vec{s_P} \circ \vec{\partial_P}$ and $\vec{\varpi_P} \circ \vec{t_P}$, respectively. As the analogous statement
holds for the corresponding subpaths of $Q'$, it follows from Lemma 5.10 that all
these subpaths do not intersect in the projection to the $b$-$c$-plane. It remains to
consider the subpath from $\alpha_P$ to $\omega_P$. If $\alpha_P = \omega_P = \bot$, then

$$\{v \in P' \mid v \in^{b,c} Q'\} = \emptyset = R = \{v \in P \mid v \in^{b,c} Q\}.$$

Otherwise, $R$ is by Lemma 5.6 a straight diagonal line and by Observation 5.5
so is $\{v \in P' \mid v \in^{b,c} Q'\}$. Since those two straight line segments have the same
ends, they are the same and thus $\{v \in P' \mid v \in^{b,c} Q'\} =^{b,c} R$.

If $P$ and $Q$ are $b,c$-non-crossing, then $\{v \in P' \mid v \in^{b,c} Q'\} \subseteq A^{b,c}(P, Q)$
and $\emptyset = R$. We consider the two cases $A^{b,c}(P, Q) = \emptyset$ and $A^{b,c}(P, Q) \neq \emptyset$. In
the former case it holds that

$$\{v \in P \mid v \in^{b,c} Q\} = R = \emptyset = A^{b,c}(P, Q) = A^{b,c}(P', Q') = \{v \in P' \mid v \in^{b,c} Q'\}.$$
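In the latter case, by Lemma 5.12 there is a $\delta_P^{b,c}(Q) \neq \bot$ or $\delta_Q^{b,c}(P) \neq \bot$. Without
loss of generality, assume that $\delta_P^{b,c}(Q) \neq \bot$. Then, $\delta_P^{b,c}(Q) \in M_P^{b,c}(Q) \subseteq P'$ and
by Lemma 5.13

$$
\begin{aligned}
\{v \in P' \mid v \in^{b,c} Q'\} &\subseteq (\vec{s_{Q'}} \circ \vec{t_{Q'}})^{b,c} \cap \left( (\vec{s_{P'}} \circ \overrightarrow{\delta_P^{b,c}(Q)})^{b,c} \cup (\overrightarrow{\delta_P^{b,c}(Q)} \circ \vec{t_{P'}})^{b,c} \right) \\
&= (\vec{s_Q} \circ \vec{t_Q})^{b,c} \cap \left( (\vec{s_P} \circ \overrightarrow{\delta_P^{b,c}(Q)})^{b,c} \cup (\overrightarrow{\delta_P^{b,c}(Q)} \circ \vec{t_P})^{b,c} \right) \\
&= \emptyset.
\end{aligned}
$$

Thus, $\{v \in P' \mid v \in^{b,c} Q'\} = \emptyset = R = \{v \in P \mid v \in^{b,c} Q\}$.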


5.2.2 More than Two Shortest Paths 


In the previous subsection, we looked at two shortest paths $P$ and $Q$ from $s_P$
to $t_P$ and from $s_Q$ to $t_Q$, respectively. We showed that selecting at most ten vertices
from $P$ and $Q$ (five per path; see Definition 5.5) is sufficient to ensure that
each pair $(P', Q')$ of shortest $s_P$-$t_P$- and $s_Q$-$t_Q$-paths that also contain these
vertices ($M_P^{b,c}(Q)$ and $M_Q^{b,c}(P)$) “behave” like $P$ and $Q$ in the sense that $P'$
and $Q'$ intersect in the same vectors as $P$ and $Q$ do (see Proposition 5.14).
In this subsection, we define a set $C$ with $|C| \in O(k \cdot k!)$ that basically ensures the
same properties for $k$ paths. To formalize our goal for this subsection, we first
introduce the concept of avoiding paths, which is a generalization of a slightly
modified version of $b,c$-non-crossing paths. The modification is to ignore the
ends of $P$ and $Q$ so that we can split paths at certain vertices and still
ensure that these different parts are avoiding.


Definition 5.6 ($I$-avoiding). Let $\emptyset \subset I \subseteq [k]$. Two paths $P$ and $Q$ are $I$-avoiding
if $p \notin^{I} Q$ for each inner vertex $p$ of $P$ and $q \notin^{I} P$ for each inner
vertex $q$ of $Q$. Two vertex pairs $(s_p, t_p)$ and $(s_q, t_q)$ are $I$-avoiding if

$$(\vec{s_p}^{\,I} \circ \vec{t_p}^{\,I}) \cap (\vec{s_q}^{\,I} \circ \vec{t_q}^{\,I}) \subseteq \{\vec{s_p}^{\,I}, \vec{t_p}^{\,I}\} \cap \{\vec{s_q}^{\,I}, \vec{t_q}^{\,I}\}.$$

Note that being $I$-avoiding implies being $I'$-avoiding for all $I' \supseteq I$. We use
avoiding as a shorthand for $[k]$-avoiding. Two paths $P_1$ and $P_2$ are internally
vertex-disjoint if neither of them contains an inner vertex of the other path.
Avoiding paths are clearly internally vertex-disjoint.
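The path version of Definition 5.6 can be illustrated with a short sketch. It assumes that $p \in^{I} Q$ means that some vertex of $Q$ has the same vector as $p$ when projected to the coordinates in $I$; the mapping `vec`, the helper names, and the toy paths are hypothetical.

```python
def project(vector, coords):
    """Project a vector to the given (sorted) coordinate indices."""
    return tuple(vector[i] for i in coords)


def are_avoiding(path_p, path_q, vec, coords):
    """Check whether two paths (lists of vertices) are I-avoiding for I = coords."""
    coords = sorted(coords)
    proj_p = {project(vec[v], coords) for v in path_p}
    proj_q = {project(vec[v], coords) for v in path_q}
    inner_p = path_p[1:-1]  # inner vertices exclude both ends
    inner_q = path_q[1:-1]
    return (all(project(vec[v], coords) not in proj_q for v in inner_p)
            and all(project(vec[v], coords) not in proj_p for v in inner_q))


# Hypothetical toy vectors with k = 3 coordinates (indices 0, 1, 2).
vec = {
    "a0": (0, 0, 0), "a1": (1, 1, 0), "a2": (2, 2, 0),
    "b0": (0, 2, 5), "b1": (1, 1, 5), "b2": (2, 0, 5),
}
P = ["a0", "a1", "a2"]
Q = ["b0", "b1", "b2"]
```

In this toy example the inner vertices `a1` and `b1` coincide in the projection to coordinates 0 and 1 but differ in coordinate 2, so the paths are avoiding for every index set containing 2, in line with the remark that $I$-avoiding implies $I'$-avoiding for all $I' \supseteq I$.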


Observation 5.15. Let P,Q be two avoiding paths. Then P is internally 
vertex-disjoint from Q. 




Moreover, for each pair of avoiding vertex pairs $(s, t)$ and $(u, w)$, the shortest
$s$-$t$- and $u$-$w$-paths are internally vertex-disjoint.


Lemma 5.16. Let (s,t) and (u,w) be two colored pairs of vertices. If the 
pairs (s,t) and (u,w) are avoiding, then each shortest s-t-path is internally 
disjoint from each shortest u-w-path. 


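Proof. If $(s, t)$ and $(u, w)$ are avoiding, then by definition and Lemma 5.2

$$(\vec{s} \circ \vec{t}) \cap (\vec{u} \circ \vec{w}) \subseteq \{\vec{s}, \vec{t}\} \cap \{\vec{u}, \vec{w}\}$$

and thus each shortest $s$-$t$-path and each shortest $u$-$w$-path only intersect
in $\{s, t\} \cap \{u, w\}$ and are therefore internally vertex-disjoint.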


With the notion of avoiding pairs, we can formulate our goal for this subsection.
To this end, fix a solution $\mathcal{P} = (P_i)_{i \in [k]}$ for a given instance $(G, (s_i, t_i)_{i \in [k]})$
of k-DISJOINT SHORTEST PATHS, that is, $P_i$ is the $s_i$-$t_i$-path in the solution.
Essentially, we want to partition the paths in $\mathcal{P}$ into subpaths and assign a
set $\Phi$ of labels to each subpath ($\Phi \subseteq [k]$) such that the following two conditions
are satisfied.


(1.) Let $P$ be a subpath with labels $\Phi \subseteq [k]$. For each $b \in \Phi$, $P$ is $b$-colored.


(2.) Let $P$ and $Q$ be subpaths from $P_i, P_j \in \mathcal{P}$ with labels $\Phi_P, \Phi_Q \subseteq [k]$,
respectively. If $\Phi_P \neq \Phi_Q$, then $(s_P, t_P)$ and $(s_Q, t_Q)$ are avoiding.


Note that (2.) will be the central argument in our algorithm for k-DISJOINT 
SHORTEST PATHS. The algorithm guesses the endpoints of these subpaths and 
based on (2.) the algorithm can then compute the inner vertices of subpaths 
with different label sets independently. 

Note that for $k = 2$ the partition of $P_1$ and $P_2$ along the sets $M_{P_1}^{1,2}(P_2)$
and $M_{P_2}^{1,2}(P_1)$ satisfies the above. Each subpath of $P_i$, $i \in [2]$, has label $i$. Moreover,
the subpaths between the $\alpha$- and $\omega$-vertices have both labels 1 and 2.
Hence, (1.) above is satisfied. Furthermore, (2.) follows from Proposition 5.14.

We now generalize this to arbitrary constant $k$. The basic idea behind
defining a respective set $C$ of marbles is depicted in Figure 5.4. Initially, each
path $P_i$ has label $i$. Whenever two paths $P_i$ and $P_j$ in the solution intersect in
the $(i, j)$-projection (that is, the respective $\alpha$- and $\omega$-vertices are not $\bot$), then
the subpaths of $P_i$ and $P_j$ in the intersection get both labels $i$ and $j$. If a third
path $P'$ also intersects with $P_j$, then we try to use the intersections to move

Figure 5.4: The three main lines represent three paths $P_1$, $P_2$, and $P_3$. The small black
rectangles represent marbles on the respective path $P_i$ and the $j$-colored lines above a
path indicate that $P_i$ and $P_j$ intersect in the $(i, j)$-projection, that is, they contain
vertices with vectors that are identical when projected in the $(i, j)$-plane. The paths $P_1$
and $P_2$ intersect in the $(1, 2)$-projection, $P_2$ and $P_3$ intersect in the $(2, 3)$-projection,
but $P_1$ and $P_3$ do not intersect in the $(1, 3)$-projection. The subpaths of $P_1$ and $P_3$
where they intersect with $P_2$ in the $(1, 2, 3)$-projection are depicted by $\alpha_1$, $\omega_1$, $\alpha_3$,
and $\omega_3$. The colors above each subpath (and also the first number therein) represent
the labels of the respective subpath and the number (or sequence of numbers) display
the sequence that led to the respective marbles (end vertices) of this subpath.


the label $i$ via path $P_j$ to some subpath of $P'$. Generalizing this, we consider
for each $\sigma = (\ell_1, \ell_2, \ldots, \ell_n)$ whether label $\ell_1$ could be “transported” from $P_{\ell_1}$
to $P_{\ell_2}$, from $P_{\ell_2}$ to $P_{\ell_3}$, and so on until from $P_{\ell_{n-1}}$ to $P_{\ell_n}$. While the idea of
transporting labels would also work with triples (transport label $a$ via path $P_b$
to path $P_c$), we do not have any bound on the number of resulting subpaths
(as for each triple there might be many such subpaths). The reason for using
sequences is that we will show that for each $\sigma = (\ell_1, \ell_2, \ldots, \ell_n)$ at most one
subpath of $P_{\ell_n}$ can receive label $\ell_1$ via $\sigma$.


In the following, we use $\operatorname{set}(\tau) := \{\ell_1, \ldots, \ell_n\}$ to denote the set with all entries
in a sequence $\tau = (\ell_1, \ldots, \ell_n)$. We next define the crossing set $C$ recursively for
each $\Phi \subseteq [k]$. This should be seen as the set of marbles of a solution. We will
then show a result similar to Proposition 5.14 for arbitrary $k$ that then allows
us to find the desired partition of paths.


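Definition 5.7. Let $(G, (s_i, t_i)_{i \in [k]})$ be an instance of k-DISJOINT SHORTEST
PATHS and let $\mathcal{P} = (P_i)_{i \in [k]}$ be a solution to this instance, that is, $P_i$ is the
path between $s_i$ and $t_i$ in the solution. For each $\Phi \subseteq [k]$ and each permutation
$\sigma = (\ell_1, \ldots, \ell_{|\Phi|})$ of $\Phi$, we define the crossing set $C^\sigma$ and the endpoints $T(\sigma)$
of intersections as follows.


• If $|\Phi| = 1$ with $\sigma = (i)$, then let $C^\sigma := T(\sigma) := \{s_i, t_i\}$.
• If $|\Phi| = 2$ with $\sigma = (i, j)$, then let
$T(\sigma) := \{\alpha_{P_j}^{i,j}(P_i), \omega_{P_j}^{i,j}(P_i)\}$ and $C^\sigma := M_{P_j}^{i,j}(P_i) \setminus \{\bot\}$.
• If $|\Phi| \ge 3$, then let $\sigma_{\mathrm{start}} := (\ell_1, \ldots, \ell_{|\Phi|-1})$ and $\sigma_{\mathrm{end}} := (\ell_2, \ldots, \ell_{|\Phi|})$.
We denote by $Q$ the maximum common subpath of $P_{\ell_{|\Phi|-1}}[T(\sigma_{\mathrm{start}})]$
and $P_{\ell_{|\Phi|-1}}[T((\ell_{|\Phi|}, \ell_{|\Phi|-1}))]$. If $T(\sigma_{\mathrm{start}}) = \{\bot\}$, $T(\sigma_{\mathrm{end}}) = \{\bot\}$,
or $V(Q) = \emptyset$, then let $T(\sigma) := C^\sigma := \{\bot\}$. Otherwise, let

$$P := P_{\ell_{|\Phi|}}[T(\sigma_{\mathrm{end}})],$$

$$T(\sigma) := \{\alpha_P^{\ell_1, \ell_{|\Phi|}}(Q), \omega_P^{\ell_1, \ell_{|\Phi|}}(Q)\}, \text{ and}$$
$$C^\sigma := (M_P^{\ell_1, \ell_{|\Phi|}}(Q) \cup M_Q^{\ell_1, \ell_{|\Phi|}}(P)) \setminus \{\bot\}.$$


The set $C := \bigcup_\sigma C^\sigma$ is the crossing set of $\mathcal{P}$.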
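To get a feeling for how many sequences $\sigma$ the union in Definition 5.7 ranges over, the following sketch counts the permutations of all non-empty subsets of $[k]$. It only counts sequences and makes no claim about the number of vertices contributed by each $C^\sigma$; the function names are hypothetical.

```python
from itertools import permutations
from math import e, factorial


def count_sequences(k):
    """Number of permutations over all non-empty subsets of {1, ..., k}.

    A subset of size m contributes m! permutations, and there are
    C(k, m) such subsets, giving k! / (k - m)! sequences of length m.
    """
    return sum(factorial(k) // factorial(k - m) for m in range(1, k + 1))


def enumerate_sequences(k):
    """Yield every sequence sigma explicitly (only feasible for small k)."""
    for m in range(1, k + 1):
        # permutations(..., m) yields all ordered selections of m distinct labels
        yield from permutations(range(1, k + 1), m)
```

Since $\sum_{m=1}^{k} k!/(k-m)! = k! \sum_{j=0}^{k-1} 1/j! < e \cdot k!$, the number of sequences grows like $k!$ up to a constant factor, for instance 15 sequences for $k = 3$.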


Observation 5.17. Let $\sigma := (\ell_1, \ldots, \ell_{|\Phi|})$ be any permutation of any $\Phi \subseteq [k]$.
If $T(\sigma) \neq \{\bot\}$, then


(i) $T(\sigma) \subseteq P_{\ell_{|\Phi|}}$, and
(ii) $T(\sigma)$ is $c$-colored for each $c \in \Phi$.
In particular, crossing sets and endpoints are well-defined.


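Proof. We prove both claims by an induction over $|\Phi|$. For $|\Phi| = 1$, note
that $T(\sigma) = \{s_{\ell_1}, t_{\ell_1}\}$. Clearly $\{s_{\ell_1}, t_{\ell_1}\} \subseteq P_{\ell_1}$ as these are the ends of $P_{\ell_1}$ and
the pair $(s_{\ell_1}, t_{\ell_1})$ is by definition $\ell_1$-colored.

Now assume that both claims hold for all $\Phi'$ with $|\Phi'| < |\Phi|$. Since $T(\sigma) \neq \{\bot\}$,
it holds that $T(\sigma) = \{\alpha_P^{\ell_1, \ell_{|\Phi|}}(Q), \omega_P^{\ell_1, \ell_{|\Phi|}}(Q)\}$, where

$$Q = P_{\ell_{|\Phi|-1}}[T((\ell_1, \ell_2, \ldots, \ell_{|\Phi|-1}))] \cap P_{\ell_{|\Phi|-1}}[T((\ell_{|\Phi|}, \ell_{|\Phi|-1}))]$$

if $|\Phi| \ge 3$ and $Q = P_{\ell_1}$ if $|\Phi| = 2$. Note that $V(Q) \neq \emptyset$ and hence if $|\Phi| \ge 3$, then
by induction hypothesis $Q \subseteq P_{\ell_{|\Phi|-1}}$ and $Q$ is $c$-colored for each $c \in \Phi \setminus \{\ell_{|\Phi|}\}$.
If $|\Phi| = 2$, then $Q = P_{\ell_1} = P_{\ell_{|\Phi|-1}}$, and $Q$ is by definition $\ell_1$-colored. Thus,

$$T(\sigma) = \{\alpha_P^{\ell_1, \ell_{|\Phi|}}(Q), \omega_P^{\ell_1, \ell_{|\Phi|}}(Q)\}$$

is well-defined and hence $T(\sigma) \subseteq P_{\ell_{|\Phi|}}$. Moreover, by Observation 5.7, it holds
that $T(\sigma)$ is $c$-colored for each $c \in (\Phi \setminus \{\ell_{|\Phi|}\}) \cup \{\ell_{|\Phi|}\} = \Phi$.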


Note that Observation 5.17 states that, for each sequence $\sigma = (\ell_1, \ell_2, \ldots, \ell_{|\Phi|})$,
the set $T(\sigma)$ describes a pair of vertices in $P_{\ell_{|\Phi|}}$. The next lemma states that
for any sequence $\sigma' = (\ell_i, \ell_{i+1}, \ldots, \ell_{|\Phi|})$ with $i > 1$ it holds that the subpath
of $P_{\ell_{|\Phi|}}$ between the two vertices in $T(\sigma)$ is a subpath of the one between the
two vertices in $T(\sigma')$, that is, if we add more entries to the front of $\sigma'$, then we
get smaller and smaller paths.


Lemma 5.18. Let $\sigma := (\ell_1, \ell_2, \ldots, \ell_{|\Phi|})$ be any permutation of any $\Phi \subseteq [k]$
with $|\Phi| \ge 2$. If $T(\sigma) \neq \{\bot\}$, then $P_{\ell_{|\Phi|}}[T(\sigma)] \subseteq P_{\ell_{|\Phi|}}[T((\ell_2, \ell_3, \ldots, \ell_{|\Phi|}))]$.


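Proof. We prove the statement by a case distinction over $|\Phi|$. If $|\Phi| = 2$,
then the statement is trivial as $T(\sigma) \subseteq P_{\ell_{|\Phi|}}$ by Observation 5.17 and by
Definition 5.7 $T((\ell_{|\Phi|})) = \{s_{\ell_{|\Phi|}}, t_{\ell_{|\Phi|}}\}$. If $|\Phi| \ge 3$, then note that $P_{\ell_{|\Phi|}}[T(\sigma)]$
is by definition of $\alpha$- and $\omega$-vertices the maximal subpath $P$ of $P_{\ell_{|\Phi|}}$ such that
there is a subpath

$$Q \subseteq P_{\ell_{|\Phi|-1}}[T((\ell_1, \ell_2, \ldots, \ell_{|\Phi|-1}))] \text{ with } P =^{\ell_1, \ell_{|\Phi|}} Q.$$

Analogously, $P_{\ell_{|\Phi|}}[T((\ell_2, \ell_3, \ldots, \ell_{|\Phi|}))]$ is the maximal subpath $P'$ of $P_{\ell_{|\Phi|}}$ such
that there is a subpath

$$Q' \subseteq P_{\ell_{|\Phi|-1}}[T((\ell_2, \ell_3, \ldots, \ell_{|\Phi|-1}))] \text{ with } P' =^{\ell_2, \ell_{|\Phi|}} Q'.$$

Since $Q \subseteq Q'$ by Definition 5.7, it also holds that $P \subseteq P'$.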


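The next lemma states that when “transporting” the labels via a permutation
$\sigma = (\ell_1, \ell_2, \ldots, \ell_{|\Phi|})$, then the intersecting subpath $P$ in the target
path $P_{\ell_{|\Phi|}}$ “agrees” in all coordinates in $\operatorname{set}(\sigma)$ with the subpath $Q$ of $P_{\ell_{|\Phi|-1}}$
where the label is transported from, that is, $P =^{\operatorname{set}(\sigma)} Q$.


Lemma 5.19. Let $\Phi \subseteq [k]$ with $|\Phi| \ge 2$. Let $\sigma := (\ell_1, \ell_2, \ldots, \ell_{|\Phi|})$ be any
permutation of $\Phi$. If $T(\sigma) \neq \{\bot\}$, then $P_{\ell_{|\Phi|}}[T(\sigma)] =^{\Phi} Q'$ for some subpath $Q'$
of $Q = P_{\ell_{|\Phi|-1}}[T((\ell_1, \ell_2, \ldots, \ell_{|\Phi|-1}))] \cap P_{\ell_{|\Phi|-1}}[T((\ell_{|\Phi|}, \ell_{|\Phi|-1}))]$.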


Proof. We will again use induction over $|\Phi|$ to prove the claim. For $|\Phi| = 2$, the
claim follows from Observation 5.7. For $|\Phi| \ge 3$, let $\sigma_{\mathrm{start}} := (\ell_1, \ell_2, \ldots, \ell_{|\Phi|-1})$
and $\sigma_{\mathrm{end}} := (\ell_2, \ell_3, \ldots, \ell_{|\Phi|})$. By Lemma 5.18, $P_{\ell_{|\Phi|}}[T(\sigma)] \subseteq P_{\ell_{|\Phi|}}[T(\sigma_{\mathrm{end}})]$ and
hence there is by induction hypothesis a subpath


R' C Po [7 (Gena)] N Para [Tom 48-1) 


with Pia [T7(o)] =set(vena) R’. Furthermore, by Definition 5.7 Te) AL 
and hence by induction hypothesis there is some subpath 


Q CQ with Pe, [T (0) ="* Q'. 
Note that R’ ="*! Pp [T(o)] ="*! Q’ and that R’ and Q’ are both subpaths 
of Q. Finally, since Q C Pe,)_,[T((4a), 4ej-1))] is by Observation 5.17 Cja)- 


colored and R’ =^?! Q’, it holds that R’ = Q’. Thus, Q! =) Pos [T(e)l 
which proves the claim. 


In Subsection 5.2.1, we defined marbles, that is, specific vertices of two
paths $P, Q$ such that when splitting $P$ and $Q$ at these vertices, then each
resulting subpath $P'$ of $P$ and $Q'$ of $Q$ fulfill either $P' =^{b,c} Q'$ or $P'$ and $Q'$ are
avoiding. In this subsection, we generalized the notion of marbles to more than
two paths at the expense of restricting them to solution paths. We conclude this 
subsection with the notion of marble paths, the final link between marbles and 
crossing sets that will allow us to guess marbles and then compute shortest paths 
between them almost independently. By that, we mean that we will define labels 
for each subpath between marbles such that paths with different labels are 
avoiding and paths with the same labels have a common color. Afterwards, we 
will show in Section 5.3 how to compute disjoint paths between marble pairs 
with a common color. 


Definition 5.8. An $i$-marble path $T$ is a set of vertices such that $\{s_i, t_i\} \subseteq T$
and for each $u, v \in T$ the pair $(u, v)$ is colored. A segment $S$ of an $i$-marble
path $T$ is a subset of $T$ containing two vertices denoted by $\operatorname{start}(S)$ and $\operatorname{end}(S)$
and all vertices $v \in T$ with $\operatorname{start}(S) <^i v <^i \operatorname{end}(S)$. A segment is minimal if it
contains exactly two vertices, and it is $j$-colored if $(\operatorname{start}(S), \operatorname{end}(S))$ is $j$-colored.
A path $P$ follows $S$ if $P$ is $i$-colored, has end vertices $\operatorname{start}(S)$ and $\operatorname{end}(S)$,
and $S \subseteq V(P)$. Two segments $S$ and $S'$ are avoiding if each path $P$ that
follows $S$ and each path $P'$ that follows $S'$ are pairwise avoiding. Two marble
paths are avoiding if all their segments are pairwise avoiding.
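The segment structure of Definition 5.8 can be sketched as follows. The sketch assumes that $u <^i v$ compares the $i$-th coordinates of the vectors of $u$ and $v$ and that the vertices of a marble path have pairwise distinct $i$-th coordinates; the mapping `vec` and the function names are hypothetical.

```python
def minimal_segments(marble_path, vec, i):
    """Return the minimal segments as (start, end) pairs, ordered by coordinate i."""
    ordered = sorted(marble_path, key=lambda v: vec[v][i])
    return list(zip(ordered, ordered[1:]))


def segment(marble_path, vec, i, start, end):
    """Return the segment with the given ends, ordered by coordinate i."""
    lo, hi = vec[start][i], vec[end][i]
    ordered = sorted(marble_path, key=lambda v: vec[v][i])
    return [v for v in ordered if lo <= vec[v][i] <= hi]


# Hypothetical 1-marble path with four vertices and one relevant coordinate.
vec = {"s": (0,), "a": (2,), "b": (5,), "t": (7,)}
T = {"s", "a", "b", "t"}
```

Under these assumptions, every segment is a union of consecutive minimal segments, which is why arguments about segments can be reduced to minimal segments.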


Before we state the main result of this section, we will prove a series of 
lemmata that involve minimal segments of marble paths. The first one states 
that adding more vertices to avoiding segments still results in avoiding segments. 
Lemma 5.20. Let $S$ be a segment of an $i$-marble path and let $U$ be a segment
of a $j$-marble path such that $S$ and $U$ are avoiding. Let $S' \supseteq S$ and $U' \supseteq U$ be
two segments with

$$\operatorname{start}(S') = \operatorname{start}(S),\quad \operatorname{end}(S') = \operatorname{end}(S),$$
$$\operatorname{start}(U') = \operatorname{start}(U),\quad \text{and}\quad \operatorname{end}(U') = \operatorname{end}(U).$$

The segments $S'$ and $U'$ are avoiding.


Proof. Assume towards a contradiction that $S'$ and $U'$ are not avoiding, that is,
there are minimal subsegments $S^*$ of $S'$ and $U^*$ of $U'$ and paths $P$ and $Q$ such
that $P$ follows $S^*$ and $Q$ follows $U^*$ and $P$ and $Q$ are not avoiding. Let $S''$ be
the minimal segment in $S$ with

$$\operatorname{start}(S'') \le^i \operatorname{start}(S^*) <^i \operatorname{end}(S^*) \le^i \operatorname{end}(S'').$$

Note that $S^* \subseteq S''$. Analogously, let $U''$ be the minimal segment in $U$ with

$$\operatorname{start}(U'') \le^j \operatorname{start}(U^*) <^j \operatorname{end}(U^*) \le^j \operatorname{end}(U'').$$

Since $P$ follows $S^*$, it is $i$-colored and contains all vertices in $S^*$. Hence it
contains all vertices in $S'' \subseteq S^*$ and thus follows $S''$. Analogously, $Q$ follows $U''$.
Hence $S''$ and $U''$ are not avoiding and thus $S$ and $U$ are by definition not
avoiding, a contradiction.


The next two lemmata state that segments of marble paths $P$ and $Q$ defined
by vertices in $M$ are avoiding unless the ends of the segment are between the
respective $\alpha$- and $\omega$-vertices. The first lemma states that if $\alpha_P^{a,b}(Q) = \bot$, then
the two marble paths are completely avoiding.


Lemma 5.21. Let $(s_P, t_P)$ be an $a$-colored pair and let $(s_Q, t_Q)$ be a $b$-colored
pair. Let $P$ be an $a$-colored $s_P$-$t_P$-path and let $Q$ be a $b$-colored $s_Q$-$t_Q$-path.
If $\alpha_P^{a,b}(Q) = \bot$, then the marble paths $M_P^{a,b}(Q)$ and $M_Q^{a,b}(P)$ are avoiding.


Proof. Note that since $\alpha_P^{a,b}(Q) = \bot$, it follows that $\{v \in P \mid v \in^{a,b} Q\} = \emptyset$.
Assume towards a contradiction that there are segments $S$ of $M_P^{a,b}(Q)$ and $S'$
of $M_Q^{a,b}(P)$ that are not avoiding. If $S$ and $S'$ are not minimal, then by definition
they contain minimal subsegments that are not avoiding. Hence we can assume
without loss of generality that $S$ and $S'$ are minimal.

Let $P'$ be an $a$-colored path that follows $S$ and let $Q'$ be a $b$-colored path
that follows $S'$ such that $P'$ and $Q'$ are not avoiding. Let further

$$P'' := P[s_P, s_{P'}] \circ P' \circ P[t_{P'}, t_P] \quad\text{and}\quad Q'' := Q[s_Q, s_{Q'}] \circ Q' \circ Q[t_{Q'}, t_Q].$$

Note that $P''$ follows $M_P^{a,b}(Q)$ and therefore $M_P^{a,b}(Q) \subseteq P''$. Analogously, $Q''$ follows
$M_Q^{a,b}(P)$ and hence $M_Q^{a,b}(P) \subseteq Q''$. By Proposition 5.14, it holds that

$$\{v \in P'' \mid v \in^{a,b} Q''\} =^{a,b} \{v \in P \mid v \in^{a,b} Q\} = \emptyset,$$

that is, $P''$ and $Q''$ are avoiding. Since $P''$ and $Q''$ are avoiding, so are all
subpaths of $P''$ and $Q''$. Thus $P'$ and $Q'$ are avoiding, a contradiction.


The next lemma deals with the case where $\alpha_P^{a,b}(Q) \neq \bot$. Recall that in this
case we only consider segments that do not contain $\alpha_P^{a,b}(Q)$ or $\omega_P^{a,b}(Q)$.

Lemma 5.22. Let $(s_P, t_P)$ be an $a$-colored pair and let $(s_Q, t_Q)$ be a $b$-colored
pair. Let $P$ be an $a$-colored $s_P$-$t_P$-path and let $Q$ be a $b$-colored $s_Q$-$t_Q$-path.
If $\alpha_P^{a,b}(Q) \neq \bot$, then let $S_1$ and $S_2$ be segments of the marble path $M_P^{a,b}(Q)$
with $\operatorname{start}(S_1) = s_P$, $\operatorname{end}(S_1) = \alpha_P^{a,b}(Q)$, $\operatorname{start}(S_2) = \omega_P^{a,b}(Q)$, and $\operatorname{end}(S_2) = t_P$.
Let further $S'$ be a segment of the marble path $M_Q^{a,b}(P)$. Then $S_1$ and $S'$ are
avoiding and so are $S_2$ and $S'$.


Proof. Note that $\alpha_P^{a,b}(Q) \neq \bot$, Observation 5.7 and Lemma 5.10 imply that

$$\{v \in P \mid v \in^{a,b} Q\} = P[\alpha_P^{a,b}(Q), \omega_P^{a,b}(Q)].$$

Assume towards a contradiction that $S_1$ and $S'$ are not avoiding or $S_2$ and $S'$
are not avoiding. Then there are paths $P_1$ that follows $S_1$, $P_2$ that follows $S_2$
and $Q'$ that follows $S'$ such that $P_1$ and $Q'$ are not avoiding or $P_2$ and $Q'$ are
not avoiding. Hence there are vertices $v$ in $Q'$ and $w$ in $P_1$ or $P_2$ that are inner
vertices with $v = w$. Let

$$P^* := P_1 \circ P[\alpha_P^{a,b}(Q), \omega_P^{a,b}(Q)] \circ P_2 \quad\text{and}\quad Q^* := Q[s_Q, s_{Q'}] \circ Q' \circ Q[t_{Q'}, t_Q].$$

Note that $P^*$ is $a$-colored as each of its subpaths is $a$-colored. Moreover, $P^*$ follows
$M_P^{a,b}(Q)$ as it contains $s_P$, $\alpha_P^{a,b}(Q)$, $\omega_P^{a,b}(Q)$, and $t_P$. Analogously, $Q^*$ follows
$M_Q^{a,b}(P)$. Proposition 5.14 then states that

$$\{v \in P^* \mid v \in^{a,b} Q^*\} =^{a,b} \{v \in P \mid v \in^{a,b} Q\} = P[\alpha_P^{a,b}(Q), \omega_P^{a,b}(Q)].$$

Thus, $w \in P[\alpha_P^{a,b}(Q), \omega_P^{a,b}(Q)]$, which is a contradiction to the assumption that $w$
is an interior vertex of the $a$-colored paths $P_1$ or $P_2$.
The final lemma generalizes the two previous ones from M (comparison of
two paths) to C (sequences of paths). Unfortunately, it contains a lot of rather 
tedious case distinctions. We remark that solving the respective cases is not 
particularly difficult or interesting. 


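Lemma 5.23. Let $(G, (s_i, t_i)_{i \in [k]})$ be an instance of k-DISJOINT SHORTEST
PATHS, let $\mathcal{P} := (P_i)_{i \in [k]}$ be a solution to this instance, and let $\Phi \subseteq [k]$.
Let $\sigma := (\ell_1, \ell_2, \ldots, \ell_{|\Phi|})$ be a permutation of $\Phi$, let $\sigma_{\mathrm{start}} := (\ell_1, \ell_2, \ldots, \ell_{|\Phi|-1})$,
and let $C$ be the crossing set of $\mathcal{P}$. Let $g := \ell_1$, $i := \ell_{|\Phi|-1}$, and $j := \ell_{|\Phi|}$.


(i) If $T(\sigma) = \{\bot\}$ and $T(\sigma_{\mathrm{start}}) \neq \{\bot\}$, then the two marble paths
$V(P_i[T(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j) \cap C$
are avoiding.
(ii) If $T(\sigma) = \{u, v\} \neq \{\bot\}$ with $u <^j v$, then the two marble paths
$V(P_i[T(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j[s_j, u]) \cap C$
are avoiding and so are
$V(P_i[T(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j[v, t_j]) \cap C$.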

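Proof. We will prove both claims by induction over $|\Phi|$.
Base case: Let $|\Phi| = 2$ and hence $g = i$ and $P_i[T(\sigma_{\mathrm{start}})] = P_i$.


(i) Since $T(\sigma) = \{\bot\}$, it holds that $\alpha_{P_j}^{i,j}(P_i) = \bot$. By Definition 5.7, it holds
that $M_{P_i}^{i,j}(P_j) \subseteq V(P_i) \cap C$ and $M_{P_j}^{i,j}(P_i) \subseteq V(P_j) \cap C$. By Lemma 5.21,
$M_{P_i}^{i,j}(P_j)$ and $M_{P_j}^{i,j}(P_i)$ are avoiding. Lemma 5.20 states that $V(P_i) \cap C$
and $V(P_j) \cap C$ are avoiding since

$$
\begin{aligned}
\operatorname{start}(V(P_i) \cap C) &= s_i = \operatorname{start}(M_{P_i}^{i,j}(P_j)), \\
\operatorname{end}(V(P_i) \cap C) &= t_i = \operatorname{end}(M_{P_i}^{i,j}(P_j)), \\
\operatorname{start}(V(P_j) \cap C) &= s_j = \operatorname{start}(M_{P_j}^{i,j}(P_i)), \text{ and} \\
\operatorname{end}(V(P_j) \cap C) &= t_j = \operatorname{end}(M_{P_j}^{i,j}(P_i)).
\end{aligned}
$$

(ii) Since $\sigma = (i, j)$, it holds that $u = \alpha_{P_j}^{i,j}(P_i)$ and $v = \omega_{P_j}^{i,j}(P_i)$. By Definition 5.7,
it holds that $M_{P_i}^{i,j}(P_j) \subseteq V(P_i) \cap C$ and $M_{P_j}^{i,j}(P_i) \subseteq V(P_j) \cap C$.
By Lemma 5.22, $M_{P_i}^{i,j}(P_j)$ and $\{s_j, u\}$ are avoiding and so are $M_{P_i}^{i,j}(P_j)$
and $\{v, t_j\}$. Thus the claim again follows from Lemma 5.20 and

$$
\begin{aligned}
\operatorname{start}(V(P_i) \cap C) &= s_i = \operatorname{start}(M_{P_i}^{i,j}(P_j)), \\
\operatorname{end}(V(P_i) \cap C) &= t_i = \operatorname{end}(M_{P_i}^{i,j}(P_j)), \\
\operatorname{start}(V(P_j[s_j, u]) \cap C) &= s_j = \operatorname{start}(\{s_j, u\}), \\
\operatorname{end}(V(P_j[s_j, u]) \cap C) &= u = \operatorname{end}(\{s_j, u\}), \\
\operatorname{start}(V(P_j[v, t_j]) \cap C) &= v = \operatorname{start}(\{v, t_j\}), \text{ and} \\
\operatorname{end}(V(P_j[v, t_j]) \cap C) &= t_j = \operatorname{end}(\{v, t_j\}).
\end{aligned}
$$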


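Induction step: Let $|\Phi| \ge 3$ and assume that the statement holds for all $\Phi' \subseteq [k]$
with $2 \le |\Phi'| < |\Phi|$. Let $\sigma_{\mathrm{end}} := (\ell_2, \ell_3, \ldots, \ell_{|\Phi|})$ and $\sigma' := (\ell_2, \ell_3, \ldots, \ell_{|\Phi|-1})$.


(i) Since $T(\sigma) = \{\bot\}$ and $T(\sigma_{\mathrm{start}}) \neq \{\bot\}$, by Definition 5.7, there are three
possible cases:
$T(\sigma_{\mathrm{end}}) = \{\bot\}$,
$V(Q) = V(P_i[T(\sigma_{\mathrm{start}})]) \cap V(P_i[T((j, i))]) = \emptyset$, or
$\alpha_{P_j[T(\sigma_{\mathrm{end}})]}^{g,j}(Q) = \bot$.
We will show that $V(P_i[T(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j) \cap C$ are avoiding in each
of the three cases.
(1) We start with the case where $T(\sigma_{\mathrm{end}}) = \{\bot\}$. Since $T(\sigma_{\mathrm{start}}) \neq \{\bot\}$
it holds by Lemma 5.18 that

$$\emptyset \neq V(P_i[T(\sigma_{\mathrm{start}})]) \subseteq V(P_i[T(\sigma')])$$

and in particular, $T(\sigma') \neq \{\bot\}$. Since $T(\sigma_{\mathrm{end}}) = \{\bot\}$, $T(\sigma') \neq \{\bot\}$,
and $\sigma_{\mathrm{end}}$ is the permutation of a set $\Phi'$ with $|\Phi'| < |\Phi|$, the induction
hypothesis states that

$$V(P_i[T(\sigma')]) \cap C \text{ and } V(P_j) \cap C$$

are avoiding. By definition, each subsegment of $V(P_i[T(\sigma')]) \cap C$ is
also avoiding $V(P_j) \cap C$ and since $V(P_i[T(\sigma_{\mathrm{start}})]) \subseteq V(P_i[T(\sigma')])$
and $T(\sigma_{\mathrm{start}}) \subseteq C$, it holds that $V(P_i[T(\sigma_{\mathrm{start}})]) \cap C$ is a subsegment
of $V(P_i[T(\sigma')]) \cap C$.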
(2) We continue with the case where

$V(Q) = V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap V(P_i[\mathcal{T}((j,i))]) = \emptyset$.

We consider the sequence $(j,i)$ and the two cases $\mathcal{T}((j,i)) = \{\bot\}$
and $\mathcal{T}((j,i)) \neq \{\bot\}$. Since $|\{j,i\}| = 2 < |\Phi|$, the induction hypothesis
states that if $\mathcal{T}((j,i)) = \{\bot\}$, then
$V(P_i) \cap C$ and $V(P_j) \cap C$ are avoiding
and if $\mathcal{T}((j,i)) \neq \{\bot\}$, then
V(P;)NC avoids both V(P, si, a (P;)])NC and V (P; [we HP apne. 


In the former case, since $V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$ is a segment of $V(P_i) \cap C$,
it holds by definition that $V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j) \cap C$ are
avoiding. In the latter case, since $V(Q) = \emptyset$, it holds that
VP, [T (start) ]) € 
V(Pi[T (ostart)]) s 


(Pilsi a$ (P;)]) or 

(Po (P;), tjl) 

Since the two cases are analogous, we assume without loss of generality 
the former, that is, V (P;[T (@start)]) <V(P: ilsna (Pi): 

Since V(P,[s:, ap J (P;)]) NC and V(P;)NC are avoiding and since 


V(PilT (start)]) NC C V(Pilsi, eB) (PA) NC, 


by definition $V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j) \cap C$ are also avoiding.


It remains to analyze the case where a% 'T(oena)|@) = 1. We as- 
sume that T (Sena) Æ {L} and V(Q) #0 as we can otherwise use 
the proofs above. Assume towards a contradiction that V(P;) NC 
and V(P,[T (ostart)]) AC are not avoiding. Then there are mune 
segments S; C V(Pi[T (ostart)]) AC and Sj C V(P;) NC that are not 
avoiding. We consider the two cases 


Si C V (Pi[T (start)]) VV (PIT (G, i)))) = V(Q) and 
Si C V (Pi[T (start)]) \V(PilT (G, 4))))- 


Note that 7((j,i)) C C and that S; is minimal and hence this case 
distinction is complete. In the latter case, note that since S; and S; 
are not avoiding, there is a path R; that follows S; and a path Rj that 
follows S; such that {ve R; | v €> R;}\ S; 40. Then, it holds by 
Observation 5.7 {ve P; |ve'7 Pj} C V(P;[T((j,2))]). Moreover, by 
Proposition 5.14 {v € Ri | v e's Rj} CI V(P;[T((j,i))]). Hence 


{fve R; |v c Rj} C% Si, 


a contradiction. 


If Si C V(P,[T (ostart)]) OV (Pi[T (7, 2))]) = V(Q), then we distinguish 
between the two cases 


Sj C V(P;[T (gena)]) NC and Sj \ V(P;[T (oena)]) # 0. 
In the former case, it holds by Lemma 5.21 that MP Tto A 


and Mg (P;[T (ena)]) are avoiding. Since 


MB Tona) (Q) c V(P; [T (ena) |) NC and 


MGI (PjIT (gena)]) E VQ) NC, 


it holds that S; and Sj are avoiding, a contradiction. 


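Finally, it remains to analyze the case where $S_j \setminus V(P_j[\mathcal{T}(\sigma_{\mathrm{end}})]) \neq \emptyset$.
Since $\mathcal{T}(\sigma_{\mathrm{start}}) \neq \{\bot\}$, it holds by Lemma 5.18 that

$\emptyset \neq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \subseteq V(P_i[\mathcal{T}(\sigma')])$

and in particular, $\mathcal{T}(\sigma') \neq \{\bot\}$. Since by assumption $\mathcal{T}(\sigma_{\mathrm{end}}) \neq \{\bot\}$,
the induction hypothesis states that

$V(P_i[\mathcal{T}(\sigma')]) \cap C$ and $V(P_j[s_j, \mathrm{start}(\mathcal{T}(\sigma_{\mathrm{end}}))]) \cap C$

are avoiding and so are

$V(P_i[\mathcal{T}(\sigma')]) \cap C$ and $V(P_j[\mathrm{end}(\mathcal{T}(\sigma_{\mathrm{end}})), t_j]) \cap C$.

Since, by Lemma 5.18,

$S_i \subseteq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C \subseteq V(P_i[\mathcal{T}(\sigma')]) \cap C$

and since

$S_j \subseteq V(P_j[s_j, \mathrm{start}(\mathcal{T}(\sigma_{\mathrm{end}}))]) \cap C$ or
$S_j \subseteq V(P_j[\mathrm{end}(\mathcal{T}(\sigma_{\mathrm{end}})), t_j]) \cap C$,

it follows that $S_i$ and $S_j$ are avoiding, a contradiction.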


(ii) In this case it holds that $\mathcal{T}(\sigma) = \{u,v\} \neq \{\bot\}$ with $u <^j v$ and it remains
to show that

$V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j[s_j,u]) \cap C$

are avoiding and so are

$V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$ and $V(P_j[v,t_j]) \cap C$.

Since both cases are analogous, we will only show that $V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$
and $V(P_j[s_j,u]) \cap C$ are avoiding. To this end, assume towards a contradiction
that there are minimal segments

$S_i \subseteq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C$ and $S_j \subseteq V(P_j[s_j,u]) \cap C$

that are not avoiding.

We consider the two cases $S_i \setminus V(P_i[\mathcal{T}((j,i))]) \neq \emptyset$ and $S_i \subseteq V(P_i[\mathcal{T}((j,i))])$.
In the former case, note that if $\{w,x\} := \mathcal{T}((j,i)) \neq \{\bot\}$ with $w <^i x$,
then it holds by Lemma 5.22 that $V(P_i[s_i,w]) \cap C$ and $V(P_j) \cap C$ are
avoiding. If $\mathcal{T}((j,i)) = \{\bot\}$, then it holds by Lemma 5.21 that $V(P_i) \cap C$
and $V(P_j) \cap C$ are avoiding. Hence in both cases $S_i$ and $S_j$ are avoiding,
a contradiction.

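Now assume that $S_i \subseteq V(P_i[\mathcal{T}((j,i))])$. Since $S_i \subseteq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})])$, it
holds by Lemma 5.18 that $\emptyset \neq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \subseteq V(P_i[\mathcal{T}(\sigma')])$ and that

$S_i \subseteq V(Q) = V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap V(P_i[\mathcal{T}((j,i))])$.

Note that if in this case $\mathcal{T}(\sigma_{\mathrm{end}}) = \{\bot\}$, then by induction hypothesis
$V(P_i[\mathcal{T}(\sigma')]) \cap C$ and $V(P_j) \cap C$ are avoiding, and thus so are

$V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C \supseteq S_i$ and $V(P_j) \cap C \supseteq S_j$,

a contradiction.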


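It remains to analyze the case where $\{y,z\} := \mathcal{T}(\sigma_{\mathrm{end}}) \neq \{\bot\}$. We resolve
this case with a final case distinction:

$S_j \setminus V(P_j[\mathcal{T}(\sigma_{\mathrm{end}})]) \neq \emptyset$ or $S_j \subseteq V(P_j[\mathcal{T}(\sigma_{\mathrm{end}})])$.

In the former case $S_j \subseteq V(P_j[s_j,y])$ or $S_j \subseteq V(P_j[z,t_j])$. By Lemma 5.18,
it holds that $S_i \subseteq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \subseteq V(P_i[\mathcal{T}(\sigma')])$. Since by Lemma 5.22

$V(P_j[s_j,y]) \cap C$ and $V(P_i[\mathcal{T}(\sigma')]) \cap C$ and
$V(P_j[z,t_j]) \cap C$ and $V(P_i[\mathcal{T}(\sigma')]) \cap C$

are avoiding, we conclude that $S_i$ and $S_j$ are avoiding, a contradiction.

Finally, if $S_j \subseteq V(P_j[\mathcal{T}(\sigma_{\mathrm{end}})])$, then it holds by induction hypothesis and
Lemma 5.18 that

$S_i \subseteq V(P_i[\mathcal{T}(\sigma_{\mathrm{start}})]) \cap C \subseteq V(P_i[\mathcal{T}(\sigma')]) \cap C$ and $S_j \subseteq V(P_j[\mathcal{T}(\sigma_{\mathrm{end}})]) \cap C$

are avoiding, a contradiction.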


We conclude this section with the definition of labels of segments and the
proof that they guarantee that paths following two segments have either a
common color or are avoiding. To this end, let $S = \{u,v\}$ be a segment of
an i-marble path with $u <^i v$. The set of labels of S (labels[S]) is defined as

$\{a \mid \exists \sigma = (\ell_1 = a, \ell_2, \ldots, \ell_{|\Phi|} = i).\ \{\alpha,\omega\} = \mathcal{T}(\sigma) \neq \{\bot\} \land \alpha \le^i u <^i v \le^i \omega\}$.


Proposition 5.24. Let $(G, (s_i,t_i)_{i \in [k]})$ be an instance of k-DISJOINT SHORTEST
PATHS and let $P = (P_i)_{i \in [k]}$ be a solution to this instance. Let $i,j \in [k]$ and
let $T_i = V(P_i) \cap C$ be an i-marble path and $T_j \subseteq V(P_j) \cap C$ be a j-marble path.
Let $S_i \subseteq T_i$ and $S_j \subseteq T_j$ be two minimal segments. If labels[$S_i$] $\neq$ labels[$S_j$],
then $S_i$ and $S_j$ are avoiding.


Proof. We start with the case where $i \notin$ labels[$S_j$]. Then either $\mathcal{T}((i,j)) = \{\bot\}$
or $S_j \cap T_j[u,v] = \emptyset$, where $\{u,v\} := \mathcal{T}((i,j)) \neq \{\bot\}$. In both cases, $S_i$ and $S_j$
are avoiding by Lemma 5.23. The case where $j \notin$ labels[$S_i$] is analogous.

It remains to consider the case where

$i, j \in$ labels[$S_i$] $\cap$ labels[$S_j$].

Without loss of generality, let $d \in$ labels[$S_i$] $\setminus$ labels[$S_j$]. By definition of labels,
there is a set $\Phi = \{\ell_1, \ell_2, \ldots, \ell_{|\Phi|}\}$ and a permutation $\sigma = (\ell_1, \ell_2, \ldots, \ell_{|\Phi|})$ of $\Phi$
such that $\ell_1 = d$, $\ell_{|\Phi|} = i$, $\mathcal{T}(\sigma) = \{\alpha,\omega\} \neq \{\bot\}$, and $S_i \subseteq P_i[\alpha,\omega] \cap C$. We
consider the two cases $j \notin \Phi$ and $j \in \Phi$.

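If $j \notin \Phi$, then let $\sigma' := (d = \ell_1, \ell_2, \ldots, \ell_{|\Phi|} = i, j)$ and we distinguish between
the two cases $\mathcal{T}(\sigma') \neq \{\bot\}$ and $\mathcal{T}(\sigma') = \{\bot\}$. If $\mathcal{T}(\sigma') \neq \{\bot\}$, then by
definition of labels and since $d \notin$ labels[$S_j$], it holds that $S_j \setminus V(P_j[\mathcal{T}(\sigma')]) \neq \emptyset$.
Lemma 5.23 states that $S_j$ and each minimal segment of $V(P_i[\mathcal{T}(\sigma)]) \cap C$ are
avoiding. Hence, $S_i$ and $S_j$ are avoiding as $S_i \subseteq V(P_i[\mathcal{T}(\sigma)]) \cap C$. If $\mathcal{T}(\sigma') = \{\bot\}$,
then, by Lemma 5.23, it holds that $V(P_i[\mathcal{T}(\sigma)]) \cap C$ and $T_j = V(P_j) \cap C$ are
avoiding. Thus by definition $S_i$ and $S_j$ are avoiding.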

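It remains to consider the case where $j \in \Phi = \{\ell_1, \ell_2, \ldots, \ell_{|\Phi|}\}$. In this
case let $x \in [2, |\Phi| - 1]$ such that $j = \ell_x$ and let $\sigma_h := (d = \ell_1, \ell_2, \ldots, \ell_h)$
for all $h \in [x, |\Phi|]$ ($x \le h \le |\Phi|$). Note that $\sigma_x := (d = \ell_1, \ell_2, \ldots, \ell_x = j)$
and $\sigma_{|\Phi|} = \sigma$. Since $S_i \subseteq V(P_i[\mathcal{T}(\sigma)]) \cap C$, it follows that $\mathcal{T}(\sigma) \neq \{\bot\}$ and, by
definition of $\mathcal{T}$, it holds for each $h \in [x, |\Phi|]$ that $\mathcal{T}(\sigma_h) \neq \{\bot\}$. Lemma 5.19 then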
states that for each A € [a, |®|] and each subpath Qla; of Pi[T (o)))] = PilT(o)] 
there is some subpath Qn of Pr,[T(on)] such that Qar —set(tr) Q,_1. Let Q\e| 
be such a path with {sQ),),t@)s,} = Si. Thus, 


Ola) = Visit Sn 27 Qr 


and in particular {start(S;),end(S;)} CÍ V(P;[T(o.)])- 
Since S; C V(P;[T(o)]) NC, it holds by Lemma 5.18 that 


SEV Te )NeC CVA ee) Ne. 
Hence, it holds that {start(S;),end(S;)} is j-colored and thus it holds for each 
path Q; that follows S; that Qi CÍ V(P;[T(ex)]). Since d € labels[S/] for 
each Si C V(P;[T(a2)]) NC, d ¢ labels[S;], and S; is minimal, it follows that 
Si N (V(P;[T(o2)]) NC) © {start(S;), end(S;)} OT (ez). 


Moreover, since P; is strictly increasing in the j* coordinate, it follows for each 
path Q; that follows S; and each path Q; that follows S; that 


VQ) NVQ) VON n VET er. 


Hence, each such pair of paths is avoiding (no two inner vertices share the same 
vector) and thus it holds by definition that S; and S} are avoiding. 
5.3 An XP-Algorithm for k-Disjoint Shortest Paths


In this section, we present our main theorem, that is, an XP-algorithm for
k-DISJOINT SHORTEST PATHS with respect to the number k of terminal pairs.
In a nutshell, we first guess all marble paths $T_i$ and the respective ends $\mathcal{T}$
corresponding to the crossing set $C$ of some solution (if one exists). We then
compute all minimal segments of each marble path $T_i$, compute their respective
labels, and partition the segments such that all minimal segments in the
same part of the partition are strictly monotone in a common coordinate and
two minimal segments in distinct parts of the partition are avoiding. The crucial
improvement over the algorithm by Lochet [Loc21] is that our partition is much
smaller. Afterwards, we find via dynamic programming for all segments in one
part of the partition disjoint paths that follow the respective segments.

To this end, we introduce c-layered DAGs and the problem p-DISJOINT
PATHS ON c-LAYERED DAGS. For a graph G with vectors $\vec{v}$ for all $v \in V$
(as defined in Subsection 5.2.1), the c-layered DAG $D_c$ of G is the directed
graph $D_c = (V, A)$, where $A = \{(x,y) \mid \{x,y\} \in E(G) \land \vec{y}^{\,c} - \vec{x}^{\,c} = 1\}$. Notice
that a path $P = (v_1, v_2, \ldots, v_p)$ is c-colored if and only if $\vec{v}_{i+1}^{\,c} - \vec{v}_i^{\,c} = 1$ for
all $i \in [p-1]$ or $\vec{v}_i^{\,c} - \vec{v}_{i+1}^{\,c} = 1$ for all $i \in [p-1]$. Let $P_m = (v_p, v_{p-1}, \ldots, v_1)$
be the mirrored path of P. Then, P is c-colored if and only if $P_m$ is and hence if
and only if the directed path $(V(P), A(P))$ or the directed path $(V(P), A(P_m))$
is a path in $D_c$. Finally, observe that $A(P_m) = A^{-1}(P)$, that is, $P_m$ and P have
the same vertices but the edges are oppositely directed.


Observation 5.25. A path P in G is c-colored if and only if $(V(P), A(P))$
or $(V(P), A^{-1}(P))$ is a path in the c-layered DAG $D_c$ of G.
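
To make the construction concrete, the following Python sketch builds the arc set of a c-layered DAG and checks the statement of the observation on a toy graph. It is an illustration under assumptions, not part of the algorithm as stated: we assume that the vectors are given as a dictionary `vec` mapping each vertex to a tuple of integers with coordinates indexed from 0, and all function names are chosen for this example only.

```python
def c_layered_dag(edges, vec, c):
    """Arc set of the c-layered DAG: orient each edge {x, y} from x to y
    whenever coordinate c increases by exactly one from x to y."""
    arcs = set()
    for x, y in edges:
        if vec[y][c] - vec[x][c] == 1:
            arcs.add((x, y))
        elif vec[x][c] - vec[y][c] == 1:
            arcs.add((y, x))
    return arcs


def is_c_colored(path, vec, c):
    """A path is c-colored if coordinate c changes by +1 in every step
    or by -1 in every step."""
    steps = [vec[v][c] - vec[u][c] for u, v in zip(path, path[1:])]
    return all(d == 1 for d in steps) or all(d == -1 for d in steps)


def follows_dag(path, arcs):
    """True if the path or its mirrored path is a directed path in the DAG."""
    def directed(p):
        return all((u, v) in arcs for u, v in zip(p, p[1:]))
    return directed(path) or directed(path[::-1])


# Toy instance (hypothetical data): one coordinate, four vertices.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
vec = {"a": (0,), "b": (1,), "c": (2,), "d": (1,)}
arcs = c_layered_dag(edges, vec, 0)

for path in (["a", "b", "c"], ["c", "b", "a"], ["b", "c", "d"]):
    # Both tests agree on every path of the toy graph.
    assert is_c_colored(path, vec, 0) == follows_dag(path, arcs)
```

In the toy instance, the path (b, c, d) first increases and then decreases in the chosen coordinate, so it is neither c-colored nor a directed path in either orientation.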


We continue with a definition of p-DISJOINT PATHS ON c-LAYERED DAGS.
Here, we are given a c-layered DAG $D_c$ and a list $(s_i,t_i)_{i \in [p]}$ of (possibly
intersecting) terminal pairs. We then ask whether there are pairwise internally
vertex-disjoint $s_i$-$t_i$-paths in $D_c$. Formally, it is defined as follows.

p-DISJOINT PATHS ON c-LAYERED DAGS

Input: A c-layered DAG $D_c$ and p pairs $(s_i, t_i)_{i \in [p]}$ of vertices.

Question: Are there p internally vertex-disjoint paths $P_i$ in $D_c$ such that $P_i$
is a shortest $s_i$-$t_i$-path for each $i \in [p]$?


With these definitions, we can state our algorithm. Algorithm 5.1 provides 
pseudo-code. 
Algorithm 5.1: Our algorithm for k-DISJOINT SHORTEST PATHS.

 1 function solve($G$, $(s_i, t_i)_{i \in [k]}$)
 2   foreach guess $(T_i)_{i \in [k]}$, Ends of the crossing set do
       /* We assume subsequently that the guesses are correct, that is, if
          there is a solution $P := (P_i)_{i \in [k]}$, then $T_i = V(P_i) \cap C$ for
          all $i \in [k]$ and Ends $= \mathcal{T}$. */
 3     foreach $i \in [k]$ do
 4       $\mathcal{P}_i \leftarrow \emptyset$   // $\mathcal{P}_i$ contains all segments corresponding to $D_i$
 5     foreach minimal segment $S$ of some $T_i$ with $i \in [k]$ do
 6       marks[$S$] $\leftarrow \emptyset$
 7       foreach permutation $\sigma = (\ell_1, \ell_2, \ldots, i)$
             with Ends($\sigma$) $= \{\alpha, \omega\} \neq \{\bot\}$
             and $\alpha \le^i \mathrm{start}(S) <^i \mathrm{end}(S) \le^i \omega$ do
 8         marks[$S$] $\leftarrow$ marks[$S$] $\cup$ set($\sigma$)
 9       $j \leftarrow \min$ marks[$S$]
10       $x \leftarrow \arg\min\{\vec{v}^{\,j} \mid v \in \{\mathrm{start}(S), \mathrm{end}(S)\}\}$
11       $y \leftarrow \arg\max\{\vec{v}^{\,j} \mid v \in \{\mathrm{start}(S), \mathrm{end}(S)\}\}$
12       $\mathcal{P}_j \leftarrow \mathcal{P}_j \cup \{(x, y)\}$
13     foreach $j \in [k]$ do
14       Order $\mathcal{P}_j = ((x_1, y_1), (x_2, y_2), \ldots)$ such that $\vec{x}_1^{\,j} \le \vec{x}_2^{\,j} \le \ldots$
15     if all instances $(D_i, \mathcal{P}_i)$ of $|\mathcal{P}_i|$-DISJOINT PATHS ON i-LAYERED
           DAGS are yes-instances and the combined solutions form a
           solution of k-DISJOINT SHORTEST PATHS then
16       return true
17   return false


Fortune et al. [FHW80] showed that p-DISJOINT PATHS ON DAGS can be
solved in $n^{O(p)}$ time. Since c-layered DAGs are DAGs, we could use their
algorithm in Algorithm 5.1 and achieve a running time of $n^{O(k \cdot k!)}$. However, to
drop the Landau notation in the exponent, we show that p-DISJOINT PATHS ON
c-LAYERED DAGS can be solved in $O(n^{p+1})$ time. Afterwards, we show that
Algorithm 5.1 is correct and runs in $O(k \cdot n^{16k \cdot k! + k + 1})$ time.
The idea behind the dynamic program for p-DISJOINT PATHS ON c-LAYERED
DAGS is as follows. Given an i-layered DAG $D_i$, a number p, and a set of
terminal pairs $(s_j,t_j)$, $j \in [p]$, where $\vec{s}_j^{\,i} < \vec{t}_j^{\,i}$ and $\vec{s}_j^{\,i} \le \vec{s}_\ell^{\,i}$ for all $j < \ell \in [p]$,
the dynamic program is a table $T[x_1, x_2, \ldots, x_p] \in \{\text{true}, \text{false}\}$ that stores true
roughly if the following two criteria are fulfilled.

i) All $x_i$, $x_j$ are pairwise different or $x_i \in \{s_i, t_i\}$ and $x_j \in \{s_j, t_j\}$, and
ii) there is a set of internally vertex-disjoint $s_i$-$x_i$-paths.

Thus, $T[t_1, t_2, \ldots, t_p] = \text{true}$ if and only if there is a set of internally
vertex-disjoint $s_i$-$t_i$-paths.

Note that if $s_i =^c s_j$ and $t_i =^c t_j$ for all $i,j \in [p]$, then there is a fairly
straightforward dynamic program for p-DISJOINT PATHS ON c-LAYERED DAGS.
Store for increasing values of $d \in [\vec{s}_1^{\,c}, \vec{t}_1^{\,c}]$ and for each tuple $(x_1, x_2, \ldots, x_p)$ of
vertices with $\vec{x}_i^{\,c} = d$ for all $i \in [p]$ whether there are pairwise disjoint paths
from $s_i$ to $x_i$ (the paths may possibly share their end vertices $s_i$ and/or $t_i$
if $x_i = t_i$). The table corresponding to this dynamic program has $O(n \cdot n^p)$
table entries (at most n values for d and for each d there are at most $n^p$
sequences of p vertices). Each table entry can be computed in $O(n^p)$ time
by iterating over all table entries for $d - 1$ and $(x'_1, x'_2, \ldots, x'_p)$ and checking
whether $(x'_1, x_1), (x'_2, x_2), \ldots, (x'_p, x_p) \in A$. This would lead to an overall running
time of $O(n^{2p+1})$ for p-DISJOINT PATHS ON c-LAYERED DAGS. Note
further that ensuring that $s_i =^c s_j$ and $t_i =^c t_j$ is not difficult either. One can
simply replace each $s_i$ and $t_i$ with new terminals and add paths of according
lengths between the new and the old terminal vertices. An example of this
roughly outlined construction is given in Figure 5.5. However, there is another
dynamic program that is faster ($O(n^{p+1})$ time instead of $O(n^{2p+1})$ time) and
that also works for general DAGs. Basically, instead of moving all $x_i$ from one
layer to the next in one step, we order them and move the $x_i$ that is first in
this ordering. This has the advantage that for computing one table entry, we
only have to consider $O(n)$ table entries instead of $O(n^p)$.


Lemma 5.26. An instance of p-DISJOINT PATHS ON DAGS on a graph
with n vertices can be solved in $O(n^{p+1})$ time.


Proof. Let $D = (V, A)$ be a DAG and let $(s_i,t_i)_{i \in [p]}$ be a set of p terminal pairs.
We define $V^{\mathrm{end}} := \bigcup_{i \in [p]} \{s_i, t_i\}$ to be the set of all terminals. We also choose
an arbitrary topological order of D and denote by $u < v$ that u comes before v
in this topological order. We assume without loss of generality that $s_i \le s_j$
for all $i < j \in [p]$. We further assume that $s_i < t_i$ for all $i \in [p]$ as otherwise
there can be no path from $s_i$ to $t_i$ and that $p \le n$ as we can iterate over all
pairs $(s_i,t_i)$ and delete those that are connected by an arc $(s_i,t_i) \in A$. All
remaining paths have at least one inner vertex that has to be from $V \setminus V^{\mathrm{end}}$
and that has to be unique for each path. Hence, if there are at least $n + 1$ pairs
remaining, then the instance has no solution.

Figure 5.5: Left-hand side: An example of 3-DISJOINT PATHS ON c-LAYERED DAGS.
The c-coordinate of vertices is illustrated by their horizontal position. The terminal
pairs are $(s_1,t_1)$, $(s_2,t_2)$, and $(s_3,t_3)$ and $s_2 = s_3$. A solution is highlighted.
Right-hand side: An equivalent instance in which $s_i =^c s_j$ for all $i,j \in [p]$. The
smaller vertices are the vertices that are newly introduced by the construction. The
corresponding solution is again highlighted. Note that since $s_2 = s_3$, after adding $s'_2$
and $s'_3$, the solution is not internally vertex-disjoint if $s_2$ was not duplicated.

We build a table $T[x_1, x_2, \ldots, x_p] \in \{\text{true}, \text{false}\}$ that stores true if and only
if the following three criteria are fulfilled.

i) $x_i \in V \setminus (V^{\mathrm{end}} \setminus \{s_i, t_i\})$,
ii) $s_i \le x_i \le t_i$ for all $i \in [p]$, and
iii) there exist $s_i$-$x_i$-paths such that each inner vertex of each of these paths is
in $V \setminus V^{\mathrm{end}}$ and that each vertex in $V \setminus V^{\mathrm{end}}$ is contained in at most one
of these paths.

If the table is completely filled, then there is a set of internally vertex-disjoint
shortest $s_i$-$t_i$-paths if and only if $T[t_1, t_2, \ldots, t_p] = \text{true}$ as the first two requirements
are trivially fulfilled. We initialize the table with $T[s_1, s_2, \ldots, s_p] := \text{true}$
as internally vertex-disjoint $s_i$-$s_i$-paths trivially exist. Moreover, for each tuple
$(x_1, \ldots, x_p) \in V^p$, if $x_i < s_i$ or $t_i < x_i$ or $x_i \in V^{\mathrm{end}} \setminus \{s_i, t_i\}$ for at least
one $i \in [p]$, then we set $T[x_1, \ldots, x_p] := \text{false}$. Note that there are $n^p$ possible
tuples and initializing each entry takes $O(n)$ time.
We next show how to compute the entries of T. To this end, for some
tuple $(x_1, x_2, \ldots, x_p)$, let $x_\ell$ be a vertex such that $x_\ell \neq s_\ell$ and $x_i \le x_\ell$ for all $x_i$
with $i \in [p]$ and $x_i \neq s_i$. Moreover, let

$N^*(x_\ell) := \{v \mid (v, x_\ell) \in A \land v \in (V \setminus (V^{\mathrm{end}} \setminus \{s_\ell\})) \setminus \{x_1, \ldots, x_p\}\}$.

Finally, let

$T[x_1, x_2, \ldots, x_p] = \bigvee_{x'_\ell \in N^*(x_\ell)} T[x_1, x_2, \ldots, x_{\ell-1}, x'_\ell, x_{\ell+1}, \ldots, x_p]$.

We now show by induction on the sum of positions in the topological order of
all $x_i$ that $T[x_1, x_2, \ldots, x_p] = \text{true}$ if and only if the three criteria are fulfilled.
In the base case, $T[s_1, s_2, \ldots, s_p] = \text{true}$, or there is some $x_i$ such that $x_i < s_i$
and therefore $T[x_1, x_2, \ldots, x_p] = \text{false}$. Note that there is no $s_i$-$x_i$-path in the
latter case.

Now to show the statement for some table entry $T[x_1, x_2, \ldots, x_p]$, assume
that the statement holds for all table entries $T[x'_1, x'_2, \ldots, x'_p]$ such that $x'_i \le x_i$
for all $i \in [p]$ and $x'_j < x_j$ for at least one $j \in [p]$. To this end, first assume
that $T[x_1, x_2, \ldots, x_p] = \text{true}$. Since $T[x_1, x_2, \ldots, x_p] = \text{true}$, it was not set
to false in the initialization and thus i) and ii) are satisfied. By construction,
there is an $x'_\ell \in N^*(x_\ell)$ such that $T[x_1, x_2, \ldots, x_{\ell-1}, x'_\ell, x_{\ell+1}, \ldots, x_p] = \text{true}$. By
induction hypothesis, there are internally vertex-disjoint $s_\ell$-$x'_\ell$- and $s_j$-$x_j$-paths
for all $j \in [p] \setminus \{\ell\}$ such that $s_\ell \le x'_\ell \le t_\ell$ and $x'_\ell \in V \setminus (V^{\mathrm{end}} \setminus \{s_\ell, t_\ell\})$. Since
by definition of $x_\ell$ it holds that $x_i \le x_\ell$ for all $x_i$ with $i \in [p]$ and $x_i \neq s_i$,
it holds that $x_\ell$ is not contained in any of the $s_i$-$x_i$-paths for $i \in [p]$. Hence
the $s_\ell$-$x'_\ell$-path can be extended by the edge $(x'_\ell, x_\ell)$ and the resulting path
combined with the other $s_i$-$x_i$-paths satisfies iii).

To show the other direction assume that $x_1, x_2, \ldots, x_p$ satisfy i) to iii). Then
consider the $s_\ell$-$x_\ell$-path and the predecessor $x'_\ell$ of $x_\ell$. Note that $x'_\ell$ exists as
otherwise $x_i = s_i$ for all $i \in [p]$ and hence we are in the base case. By construction,
$x'_\ell \in N^*(x_\ell) \subseteq V \setminus (V^{\mathrm{end}} \setminus \{s_\ell\})$. Note further that $x'_\ell < x_\ell \le t_\ell$,
implying $x'_\ell \neq t_\ell$ and hence i) is also satisfied by $x'_\ell$. Further, since there
is an $s_\ell$-$x'_\ell$-path (a subpath of the $s_\ell$-$x_\ell$-path), it holds that $s_\ell \le x'_\ell < x_\ell \le t_\ell$
and thus $x'_\ell$ also satisfies ii). Finally, iii) is also satisfied by the $s_\ell$-$x'_\ell$-subpath
combined with the other $s_i$-$x_i$-paths. The induction hypothesis then states

that $T[x_1, x_2, \ldots, x_{\ell-1}, x'_\ell, x_{\ell+1}, \ldots, x_p] = \text{true}$. Since $x'_\ell \in N^*(x_\ell)$, it holds
that $T[x_1, x_2, \ldots, x_p] = \text{true}$. Thus, the statement holds for all table
entries $T[x_1, x_2, \ldots, x_p]$.
It remains to analyze the running time. There are at most $n^p$ possible table
entries and computing one takes $O(n)$ time as $V^{\mathrm{end}}$, $\ell$, and $N^*(x_\ell)$ can be
computed in $O(p + n) \subseteq O(n)$ time and iterating over all neighbors of $x_\ell$
takes $O(n)$ time. Hence the overall running time is $O(n^{p+1})$.
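
The recurrence can be illustrated by the following Python sketch. It is a minimal sketch under assumptions and not the implementation analyzed above: it evaluates the table top-down with memoization instead of filling it in order, it assumes $s_i \neq t_i$ for every pair, and it makes the treatment of shared terminals explicit in a helper `valid` (a name introduced here) instead of using the set $N^*$ verbatim. It visits at most $n^p$ states with at most n predecessors each, but we do not claim that it matches the stated bound exactly, and the recursion depth limits it to small instances.

```python
from functools import lru_cache
from graphlib import TopologicalSorter


def disjoint_paths_on_dag(vertices, arcs, pairs):
    """Decide whether there are s_i-t_i-paths in a DAG whose inner vertices
    are non-terminals and pairwise distinct across all paths."""
    preds = {v: set() for v in vertices}
    for u, v in arcs:
        preds[v].add(u)
    # TopologicalSorter takes a mapping from each node to its predecessors.
    order = list(TopologicalSorter(preds).static_order())
    pos = {v: idx for idx, v in enumerate(order)}
    terminals = {v for pair in pairs for v in pair}
    p = len(pairs)

    def valid(state):
        # Each x_i is s_i, t_i, or a non-terminal used by no other path.
        seen = set()
        for (s, t), x in zip(pairs, state):
            if x in (s, t):
                continue
            if x in terminals or x in seen:
                return False
            seen.add(x)
        return True

    @lru_cache(maxsize=None)
    def table(state):
        if not valid(state):
            return False
        moved = [i for i in range(p) if state[i] != pairs[i][0]]
        if not moved:
            return True  # base case: every path is the single vertex s_i
        # Undo the last step of the path whose end is latest in the order.
        l = max(moved, key=lambda i: pos[state[i]])
        for prev in preds[state[l]]:
            if prev != pairs[l][0] and prev in terminals:
                continue  # inner vertices must not be terminals
            if table(state[:l] + (prev,) + state[l + 1:]):
                return True
        return False

    return table(tuple(t for _, t in pairs))


# Two pairs that both need vertex 2 as an inner vertex: no solution.
print(disjoint_paths_on_dag(range(5), [(0, 2), (1, 2), (2, 3), (2, 4)],
                            [(0, 3), (1, 4)]))
```

Choosing the path whose current end vertex is latest in the topological order mirrors the choice of $x_\ell$ in the proof: the vertex that is removed cannot lie on any of the other partial paths.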


After showing how to solve the subproblems, it remains to show that Algo- 
rithm 5.1 is correct and to analyze its running time. We start with the analysis 
of the running time. 


Lemma 5.27. Algorithm 5.1 runs in $O(k \cdot n^{16k \cdot k! + k + 1})$ time.


Proof. First, observe that there are at most $k \cdot k!$ different permutations of
subsets of k objects as there are exactly k! permutations of exactly k objects 
and each of these can be truncated at k positions to get any permutation of any 
smaller (non-empty) subset of objects. Second, observe that by Definition 5.7 
there are at most eight vertices guessed for each sequence o as if p (Q) # L, 
then at (Q) = w (Q) = 087 (Q) = we (Q) = L. Hence, at most 8k- k! vertices 
need to be guessed, which requires at most $n^{8k \cdot k!}$ attempts.
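
The first counting bound can be checked by brute force for small k. The following Python sketch enumerates all permutations of all non-empty subsets of k objects and compares their number with $k \cdot k!$; the function name is chosen for this illustration only.

```python
from itertools import permutations
from math import factorial


def count_subset_permutations(k):
    """Number of permutations of non-empty subsets of a k-element set."""
    return sum(1 for r in range(1, k + 1)
               for _ in permutations(range(k), r))


for k in range(1, 7):
    # The bound k * k! from the proof holds for every tested k.
    assert count_subset_permutations(k) <= k * factorial(k)
```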

Next we analyze the running time of each iteration of the main foreach-loop
in Algorithm 5.1. Notice that by Definition 5.7, for each sequence $\sigma$ there
are at most four vertices on a marble path $T_i$ and that each of these vertices
increases the number of minimal segments S on $T_i$ by at most one. Note that for
each $\sigma$ the set $C^{\sigma}$ contains vertices from at most two paths. Thus, we create at
most $8k \cdot k!$ new segments overall. Since we start with k marble paths, there are
at most $8k \cdot k! + k$ minimal segments. Thus, there are at most $(8k \cdot k! + k) \cdot (k \cdot k!)$
iterations of the loop in Line 7, each of which takes constant time. Each iteration
of Line 14 can be done in $O(n)$ time using bucket sort and hence the overall
running time for all iterations is in $O(n \cdot k)$.

Next, there are k instances of $p_i$-DISJOINT PATHS ON i-LAYERED DAGS
that are solved using Lemma 5.26, where $p_i \le 8k \cdot k! + k$ for all $i \in [k]$. By
Lemma 5.26 the running time for solving one instance is $O(n^{8k \cdot k! + k + 1})$ and the
running time for solving all instances is hence $O(k \cdot n^{8k \cdot k! + k + 1})$. Lastly, we verify
in Algorithm 5.1 that the solutions found can indeed be merged into one solution
for k-DISJOINT SHORTEST PATHS. Note that we only stated the decision version
of p-DISJOINT PATHS ON c-LAYERED DAGS but the actual solution can be
found using a very similar algorithm where we do not only store true or false
in the table T but also some set of disjoint paths corresponding to each table
entry that stores true. Verifying a solution can for example be done in $O(k \cdot n)$
time by iterating over all solution paths and verifying that between each pair of
consecutive vertices there is an edge, that all paths are shortest paths, and that
all paths are internally vertex-disjoint. This can be done by marking all inner
vertices of each path and if some vertex is already marked once and visited
again, then return false and otherwise return true. Thus the overall running
time of Algorithm 5.1 is

$O(n^{8k \cdot k!} \cdot ((8k \cdot k! + k) \cdot (k \cdot k!) + n \cdot k + k \cdot n^{8k \cdot k! + k + 1} + n \cdot k)) \subseteq O(k \cdot n^{16k \cdot k! + k + 1})$.
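
The verification step can be sketched in Python as follows. This is an illustration under assumptions rather than the procedure analyzed above: the graph is assumed to be given as a dictionary `adj` mapping each vertex to the set of its neighbors, shortest-path lengths are recomputed by a breadth-first search per terminal pair (which costs more than the $O(k \cdot n)$ bound stated above), and inner vertices are additionally required to be non-terminals. All names are hypothetical.

```python
from collections import deque


def bfs_dist(adj, source):
    """Hop distances from source in an unweighted, undirected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def verify_solution(adj, pairs, paths):
    """Check that paths[i] is a shortest s_i-t_i-path for every pair and
    that no inner vertex is a terminal or lies on a second path."""
    if len(paths) != len(pairs):
        return False
    terminals = {v for pair in pairs for v in pair}
    marked = set()
    for (s, t), path in zip(pairs, paths):
        if not path or path[0] != s or path[-1] != t:
            return False
        # Consecutive vertices must be adjacent.
        if any(v not in adj[u] for u, v in zip(path, path[1:])):
            return False
        # The path must be a shortest s-t-path.
        if bfs_dist(adj, s).get(t) != len(path) - 1:
            return False
        # Mark inner vertices; a second visit violates disjointness.
        for v in path[1:-1]:
            if v in marked or v in terminals:
                return False
            marked.add(v)
    return True


# A cycle on four vertices with two identical terminal pairs.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
pairs = [("a", "c"), ("a", "c")]
assert verify_solution(square, pairs, [["a", "b", "c"], ["a", "d", "c"]])
assert not verify_solution(square, pairs, [["a", "b", "c"], ["a", "b", "c"]])
```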


For the correctness of Algorithm 5.1, we need to show that each part of 
the partition of minimal segments can be solved independently. This follows 
from Proposition 5.24 together with the fact that Algorithm 5.1 exhaustively 
tries all possibilities for the crossing set C. Together with Lemma 5.27, this
implies our main theorem. 


Theorem 5.28. k-DISJOINT SHORTEST PATHS is solvable in $O(k \cdot n^{16k \cdot k! + k + 1})$
time.


Proof. We use Algorithm 5.1 and focus on the correctness as the running time 
is already analyzed in Lemma 5.27. If Algorithm 5.1 returns true, then Line 16 
is executed and a solution is verified. It remains to show that if there is some 
solution, then Algorithm 5.1 returns true. If there is some solution $P = (P_i)_{i \in [k]}$,
then let C be its crossing set (Definition 5.7). Then, there is some iteration of
Line 2 where all guesses are correct, that is, Ends $= \mathcal{T}$ and $T_i = V(P_i) \cap C$. We
now consider this iteration of Line 2. 

Observation 5.17 states that for each sequence $\sigma$ and for each segment S
with $\{\mathrm{start}(S), \mathrm{end}(S)\} = \mathrm{Ends}(\sigma) = \mathcal{T}(\sigma)$ the pair $\{\mathrm{start}(S), \mathrm{end}(S)\}$ is
c-colored for each $c \in \mathrm{set}(\sigma)$. Hence the same also holds for each minimal
segment $S' \subseteq S$. By Line 7, there is a solution where the shortest paths
between the endpoints of each minimal segment S are strictly c-monotone for
each $c \in$ labels[S]. Note that labels[S] = marks[S] in this iteration of Line 2.
Hence each path following S is strictly c-increasing for each $c \in$ marks[S] and
by Observation 5.25 this shortest path is contained in $D_c$. Hence we can find
some solution for each minimal segment using Lemma 5.26 such that all paths
for these minimal segments with the same marks are internally vertex-disjoint.
Since marks[S] = labels[S] for all minimal segments, by Proposition 5.24, all
shortest paths between endpoints of minimal segments with different marks are
internally vertex-disjoint. Hence, the result computed by Algorithm 5.1 is a
solution to k-DISJOINT SHORTEST PATHS and thus the algorithm returns true.
5.4 Concluding Remarks


We provided an improved polynomial-time algorithm for k-DISJOINT SHORTEST 
PATHS for constant k. However, while the running time of Algorithm 5.1 
can certainly be further improved by some case distinctions and a further 
refined analysis, the algorithm is still far from being practical. We believe that 
Algorithm 5.1 can be improved to run in $n^{2^{O(k)}}$ time. It is left open whether a
running time of nro is possible. 

Concerning generalizations of k-DISJOINT SHORTEST PATHS, we believe that 
Algorithm 5.1 can be modified to not only work for unit edge lengths but also for 
positive integer lengths. However, the case of non-negative edge lengths seems 
much more difficult as edges with length zero result in overlapping vertices in 
our geometric representation. Finally, if there are no k disjoint shortest paths 
for some constant k, then computing in polynomial time disjoint paths with minimum
length is still an open problem (for k = 2 Björklund and Husfeldt [BH19]
provided an $O(n^{11})$-time randomized algorithm).




Part II 


2-SAT Programming 
Chapter 6


Tree Containment 


In this chapter, we investigate a problem from computational biology. Concern- 
ing 2-SAT programming, we present a general k-SAT program that shows that 
a relevant special case is polynomial-time solvable as the resulting program only 
contains 2-SAT formulas. Concerning problem-specific aspects, we introduce 
a new variant of a well-known problem in computational biology. The new 
version models a certain uncertainty regarding the history of evolution. We then 
identify a relevant special case of this new variant and a natural parameter that 
models the amount of uncertainty. We conclude with an equivalence between 
the identified special case and k-SAT in the sense that there are reductions 
from and to k-SAT, where the value of k in both reductions matches the value 
of our identified parameter. This proves that the special case is polynomial-time 
solvable for $k \le 2$ and NP-hard for $k \ge 3$.

With the dawn of molecular biology also came the realization that evolutionary 
trees, which have been widely adopted by biologists, are insufficient to describe 
certain processes that have been observed in nature. In the last decade, the 
idea of reticulate evolution, supporting gene flow from multiple parent species, 
arose [CCR13, TR11]. Reticulate evolution is described using “phylogenetic 
networks” (see the monographs by Gusfield [Gus14] and Huson et al. [HRS10] 
or the formal definitions in Section 6.1). A central question when dealing 
with phylogenetic networks is whether or not different phylogenetic networks 
provide consistent information. The corresponding problem is known as TREE 
CONTAINMENT and it has been shown to be NP-hard [ISS10, Kan+08]. 

In real life, we cannot hope for perfectly precise evolutionary history. In 
particular, speciation events (a species splitting off another) occurring in rapid
succession (only a few thousand years between speciation events) can often
not be reliably placed in the order in which they occurred. Incomplete information
about a certain set of successive speciation events is called a soft polytomy
and it is modeled by a non-binary vertex (a vertex with more than two parent 
species) in a phylogenetic network. We consider the information provided by two 
non-binary phylogenetic networks consistent if we can replace each non-binary 
vertex by some binary tree such that the resulting binary phylogenetic networks 
provide consistent information. 

In Section 6.2, we present first structural results for TREE CONTAINMENT 
with soft polytomies. In Section 6.3, we show that if one input network is a 
single-labeled phylogenetic tree and the other input network is a multi-labeled 
tree (for a definition, see Section 6.1), then TREE CONTAINMENT is polynomial- 
time solvable if each label occurs at most twice in the multi-labeled phylogenetic 
tree and NP-complete otherwise. The polynomial-time algorithm is based on 
the results from Section 6.2 and 2-SAT programming. 


6.1 Problem Definition and Related Work 


A phylogenetic network on a set X of taxa is a rooted, single-source, directed, 
and acyclic graph in which all vertices have in-degree at most one or out-degree 
exactly one and each leaf v (a vertex with out-degree zero) is labeled with one
taxon $x \in X$. We also say that v has label x. By default, no label occurs
twice in a phylogenetic network, and we will make exceptions explicit by calling
phylogenetic networks multi-labeled if a label can occur more than once. We
say that it is $\ell$-labeled if each label occurs at most $\ell$ times and if we want to
emphasize that a phylogenetic network is not multi-labeled, then we call it
single-labeled. Vertices with in-degree at least two (and out-degree one) are
called reticulations and the other vertices are called tree vertices. A phylogenetic 
network without reticulations is called a phylogenetic tree and a phylogenetic 
network or tree is called binary if each vertex has in-degree and out-degree 
at most two. Figure 6.1 shows an example of a binary phylogenetic network 
(left-hand side) and a phylogenetic tree (right-hand side). 

An important task in computational biology is to check whether two models
of evolution are consistent. A relevant special case therein is whether a given
phylogenetic network is consistent with an existing tree model or not [Gam+15].
A phylogenetic network N and a phylogenetic tree T are considered consistent
if N displays T. For the definition of displaying, recall that subdividing an
arc (u,v) in a directed graph refers to removing the arc (u,v) and replacing it
by a new vertex w and two new arcs (u,w) and (w,v). A subdivision of a graph
is the result of repeatedly subdividing arcs in it.

Figure 6.1: A 2-labeled phylogenetic network N (left-hand side) and a phylogenetic
tree T (right-hand side). The respective topmost vertex is the only source and is called
the root. The leaves are each labeled with one element of the set {a,b,c,d,e,f}. The
parents of the leaves d and e in the left example are the reticulations in N and all other
vertices are tree vertices. Removing the three smaller vertices (and all incident arcs)
in N on the left-hand side and subdividing each dashed arc in T on the right-hand
side once yields isomorphic¹ trees. Hence, N displays T.


Definition 6.1. Let N be a (possibly multi-labeled) phylogenetic network and 
let T be a single-labeled phylogenetic tree. Then, N firmly displays T if a 
subdivision of N contains a subdivision of T as a subgraph such that leaf-labels 
are respected, that is, each leaf v in T with label x is mapped to a leaf with 
label x in N. 


An example for Definition 6.1 is depicted in Figure 6.1. Based on this 
definition, TREE CONTAINMENT is defined as follows. 


TREE CONTAINMENT 

Input: A (possibly multi-labeled) phylogenetic network N and a single- 
labeled phylogenetic tree T. 

Question: Does N firmly display T? 


¹In this chapter, isomorphic always refers to an isomorphism respecting leaf-labels, that is,
the isomorphism must map a leaf with some label A in N to a leaf with label A in T.
Kanj et al. [Kan+08] showed that TREE CONTAINMENT is NP-hard. Due
to its importance in the analysis of evolutionary history, there have been
several attempts to identify polynomial-time computable special cases [BS16,
FKP15, Gam+15, GDZ17, Gun18, ISS10, Kan+08, Wel18] as well as moderately
exponential-time algorithms [GLZ16, Wel18]. Since the definitions for the special
cases are rather technical and the results are not relevant for this thesis, we
do not present definitions here but only refer the reader to the works by
Fakcharoenphol et al. [FKP15] and Weller [Wel18] for an overview.

Motivated by the concept of soft polytomies, that is, incomplete knowledge 
about the order of a limited set of speciation events, we consider a notion we 
call soft displaying. The goal is to allow any high-degree vertex to be replaced 
by any binary tree such that the resulting phylogenetic network firmly displays 
the resulting phylogenetic tree. To this end, we consider arc contractions. 
Contracting an arc (u,v) in a directed graph refers to the process of “merging” u 
and v (and all incident arcs). Formally, vertices u and v are removed and replaced 
by a new vertex w. For each vertex x other than u or v, if (x, u) or (x, v) existed
in the original graph, then the new graph contains an arc (x, w) and if (u, x)
or (v, x) existed in the original graph, then the new graph contains an arc (w, x).
A contraction of a phylogenetic network is the result of repeatedly performing
arc contractions in it. We call a binary phylogenetic network $B = (V_B, A_B)$ a
binary resolution of a phylogenetic network $N = (V_N, A_N)$ if N is a contraction
of B. An example of contractions and binary resolutions is given in Figure 6.2.
We call a surjective function $\chi\colon V_B \to V_N$ a contraction function of B for N if
contracting all arcs (u, v) in B with $\chi(u) = \chi(v)$ results in a graph isomorphic
to N. The notion of binary resolutions leads to the following definition of soft 
displaying. 


Definition 6.2. Let N be a (possibly multi-labeled) phylogenetic network and 
let T be a single-labeled phylogenetic tree. Then, N softly displays T if there 
are binary resolutions $N_B$ of N and $T_B$ of T such that $N_B$ firmly displays $T_B$.
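
The arc contraction that underlies binary resolutions can be sketched in a few lines of Python. This is only an illustration under our own assumptions: a directed graph is stored as a set of arcs, and the function name `contract_arc` and the name of the merged vertex are hypothetical and not taken from the thesis.

```python
from typing import Hashable, Set, Tuple

Arc = Tuple[Hashable, Hashable]


def contract_arc(arcs: Set[Arc], u: Hashable, v: Hashable, w: Hashable) -> Set[Arc]:
    """Contract the arc (u, v): remove u and v and replace them by the new vertex w.

    Every arc that entered u or v now enters w, and every arc that left
    u or v now leaves w. The contracted arc itself disappears.
    """
    if (u, v) not in arcs:
        raise ValueError("only an existing arc can be contracted")
    merged = {u, v}
    result = set()
    for tail, head in arcs:
        new_tail = w if tail in merged else tail
        new_head = w if head in merged else head
        if new_tail != new_head:  # skip the contracted arc (it would become a loop)
            result.add((new_tail, new_head))
    return result


# A binary tree: root r with children q and c, where q has the children a and b.
binary = {("r", "q"), ("r", "c"), ("q", "a"), ("q", "b")}
# Contracting (r, q) yields a single vertex with the three children a, b, and c.
star = contract_arc(binary, "r", "q", "w")
```

In this small example, the binary tree is a binary resolution of the resulting star, since the star is obtained from it by a single contraction.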


Note that, since each binary resolution of a binary phylogenetic network N is 
a subdivision of N, it holds that the concepts of firm and soft displaying coincide 
for binary phylogenetic networks. The notion of soft displaying naturally leads 
to the following definition of SOFT TREE CONTAINMENT. 


SOFT TREE CONTAINMENT 

Input: A (possibly multi-labeled) phylogenetic network N and a single- 
labeled phylogenetic tree T. 

Question: Does N softly display T? 
Figure 6.2: Two phylogenetic trees B (left-hand side) and T (right-hand side). The
phylogenetic tree B is binary. Contracting the arc between the two green vertices in B 
yields the green vertex in T. Analogously, exhaustively contracting any arc between 
two blue vertices in B yields the blue vertex in T. Since the result of contracting these 
arcs in B is isomorphic to T, the phylogenetic tree B is a binary resolution of T. 


An example of SOFT TREE CONTAINMENT is given in Figure 6.3. Throughout 
this chapter, we will mostly focus on SOFT TREE CONTAINMENT and for the 
sake of readability, we refer to soft displaying simply as “displaying”. To the 
best of our knowledge, we are the first to study SOFT TREE CONTAINMENT. In 
this thesis, we focus on the special case where N is a multi-labeled phylogenetic 
tree. This has three main reasons. First, TREE CONTAINMENT is known to be 
NP-hard even on binary phylogenetic networks and since TREE CONTAINMENT 
and SOFT TREE CONTAINMENT coincide for binary phylogenetic networks, SOFT 
TREE CONTAINMENT is NP-hard on binary phylogenetic networks (that is, N is 
not restricted to being a phylogenetic tree). Conversely, TREE CONTAINMENT is 
polynomial-time solvable when N is a phylogenetic tree [Gam+15] and hence, the
computational complexity of SOFT TREE CONTAINMENT on phylogenetic trees 
remains unclear. Second, reticulation events are comparatively rare especially 
when considering phylogenies of animals and so chances are that the input 
consists of phylogenetic trees (or phylogenetic networks with few reticulations). 
Hence, SOFT TREE CONTAINMENT on phylogenetic trees is a relevant special 
case from a biological perspective. Third, each algorithm for SOFT TREE
CONTAINMENT on phylogenetic networks has to decide on a subgraph of N 
that is a phylogenetic tree and then verify that this phylogenetic tree softly 
Figure 6.3: An example for SOFT TREE CONTAINMENT. In the top left-hand corner
is a multi-labeled tree N and in the top right-hand corner is a single-labeled tree T.
In the bottom right-hand corner is a subdivision of (a binary resolution of) T and
in the bottom left-hand corner is (a subdivision of) a binary resolution of N. The
subgraph in the bottom left-hand corner consisting of all vertices except for the two
small vertices and all but the two dashed arcs is isomorphic to the phylogenetic tree
in the bottom right-hand corner. This shows that N softly displays T.


displays T. Thus, SOFT TREE CONTAINMENT on phylogenetic trees is a relevant 
special case from an algorithmic perspective. 


We conclude this section with some notation for the remainder of this chap- 
ter. In a single-labeled phylogenetic network, we use leaves and labels (taxa) 
interchangeably. A binary phylogenetic network B on three leaves a, b, and c 
is called a triplet and we denote it by ab|c if c is a child of the root of B. In 
Figure 6.1, the subtree rooted in the parent of the leaf labeled with a is the 
triplet bc|a. We denote by $N_v$ the subnetwork (or subtree) of N rooted in v,
that is, the induced subgraph containing v and all its descendants. We denote
the set of labels in a subnetwork $N_v$ by $\mathcal{L}(N_v)$. Slightly abusing notation, we
use n as the maximum number of vertices in N and T. 
Recall that we use the notation $v <_D u$ to denote that a vertex v is a
descendant of a vertex u in a directed acyclic graph (DAG) D. We use $v \le_D u$
to denote that v is a descendant of u in D or v = u. Moreover, recall that the
least common ancestor(s) (LCA) of a set V' of vertices is a set L of vertices such
that each vertex in L is an ancestor of each vertex in V' and no descendant of a
vertex in L is an ancestor of each vertex in V'. In trees, the LCA of any set of
vertices is always a set containing a single vertex and for the sake of readability, 
we will assume that the LCA in a tree is a single vertex. 

Let N = (V, A) be a phylogenetic network. Recall that suppressing a vertex v 
with one incoming arc (u,v) and one outgoing arc (v, w) refers to the procedure 
of removing v and both incident arcs and adding the arc (u, w) to the graph 
(if it does not already exist). For any subset $U \subseteq V$ of vertices, we denote the
result of removing all vertices v that do not have a descendant in U by $N|_U$,
and $N\|_U$ is the result of suppressing all degree-two vertices in $N|_U$. Such a
phylogenetic network $N\|_U$ can be computed in O(|U|) time [Col+00]. Moreover,
if N is a phylogenetic tree, then $N|_U$ is the smallest subtree of N containing
the vertices in U and the root of N.

If N contains a subgraph S that is isomorphic to a tree T up to subdivision 
of arcs, then we simply say that N contains a subdivision of T. Slightly abusing 
notation, if an isomorphism maps a vertex v in T to a vertex u in S (and thus
in N), then we do not distinguish between u and v but say that both vertices 
are the same. Thus, S consists of all vertices in T and some vertices of in- and 
out-degree one. 


6.2 Single-labeled Trees 


In this section, we will develop a characterization of when a single-labeled 
phylogenetic tree softly displays another single-labeled phylogenetic tree. To 
this end, all phylogenetic networks are single-labeled in this section. The 
characterization will then be used in Section 6.3 to design an algorithm for SOFT 
TREE CONTAINMENT when the input network N is a multi-labeled phylogenetic 
tree. 

We start with a series of basic observations regarding the concept of displaying. 
First, note that a binary phylogenetic tree displays another binary phylogenetic 
tree if and only if they are isomorphic up to subdivision of arcs. Hence, if a 
phylogenetic tree T displays another phylogenetic tree T' on the same set of taxa,
then there exist binary resolutions B of T and B' of T' such that B displays B',
that is, B and B' are isomorphic up to subdivision of arcs. Since isomorphism
is a symmetric relation, T' then also displays T.


Observation 6.1. A phylogenetic tree T displays a phylogenetic tree T’ on the 
same label-set if and only if T’ displays T. 


For binary trees and, in particular, triplets, the concept of firm displaying 
is well-researched and we will use the following characterization to develop a 
characterization for when a phylogenetic tree softly displays another phylogenetic 
tree. 


Lemma 6.2 ([Dre+12, Chapter 9.1]). Let B be a binary phylogenetic tree. 
Let $a, b, c \in \mathcal{L}(B)$ be three distinct labels. Then, B firmly displays the triplet ab|c
if and only if


$$\mathrm{LCA}(\{a, b\}) <_B \mathrm{LCA}(\{b, c\}) = \mathrm{LCA}(\{a, c\}).$$


Indeed, B is uniquely identified (up to subdivision and suppression of degree-two 
vertices) by the set D of displayed triplets, that is, B is the only binary tree 
displaying the triplets in D. 
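
The LCA condition of Lemma 6.2 translates directly into a small test. The following Python sketch assumes, purely for illustration, that a single-labeled tree is given as a child-to-parent dictionary in which leaves are identified with their labels; the function names are our own.

```python
def ancestors(parent, v):
    """Return v followed by all of its ancestors, from bottom to top."""
    chain = [v]
    while v in parent:
        v = parent[v]
        chain.append(v)
    return chain


def lca(parent, a, b):
    """Least common ancestor of two vertices in a rooted tree."""
    above_a = set(ancestors(parent, a))
    for x in ancestors(parent, b):
        if x in above_a:
            return x
    raise ValueError("vertices do not share an ancestor")


def firmly_displays_triplet(parent, a, b, c):
    """Check LCA({a, b}) < LCA({b, c}) = LCA({a, c}), i.e. the triplet ab|c."""
    ab, bc, ac = lca(parent, a, b), lca(parent, b, c), lca(parent, a, c)
    # ab is a strict descendant of bc if it differs from bc and bc lies above it.
    return bc == ac and ab != bc and bc in ancestors(parent, ab)


# The triplet ab|c: a and b share the parent x, and c is a child of the root r.
triplet = {"a": "x", "b": "x", "x": "r", "c": "r"}
# A star on the same labels displays no triplet firmly.
star = {"a": "r", "b": "r", "c": "r"}
```

The quadratic-time LCA computation is chosen for readability only; it is not meant to reflect the running times discussed in this chapter.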


Based on Lemma 6.2, we can now relate the two forms of displaying for 
triplets in non-binary trees. To this end, recall that in trees the LCA of a set of 
vertices is uniquely determined. Moreover, it is easy to verify that if it holds 
for three leaves a, b, and c in a tree T that $\mathrm{LCA}_T(\{a, b\}) <_T \mathrm{LCA}_T(\{a, c\})$,
then $\mathrm{LCA}_T(\{a, c\}) = \mathrm{LCA}_T(\{b, c\})$. Lemma 6.2 and the definition of soft
displaying then immediately imply the following.


Observation 6.3. Let T be a tree and let $a, b, c \in \mathcal{L}(T)$. Then,
(a) T firmly displays ab|c if and only if
$\mathrm{LCA}(\{a, b\}) <_T \mathrm{LCA}(\{a, c\}) = \mathrm{LCA}(\{b, c\})$.


(b) T firmly displays ac|b or bc|a if and only if T does not softly display ab|c.


The next observation states that, in trees, an arc contraction does not change 
the ancestor relation. This is important as it allows us to reason about LCAs 
in binary resolutions. 


Observation 6.4. Let T be a tree and let T’ be the result of contracting any 
arc in T. Let Y and Z be two sets of leaves common to T and T’. Then, 
(a) $\mathrm{LCA}_T(Y) \le_T \mathrm{LCA}_T(Z)$ if and only if $\mathrm{LCA}_{T'}(Y) \le_{T'} \mathrm{LCA}_{T'}(Z)$ and
(b) if $\mathrm{LCA}_{T'}(Y) <_{T'} \mathrm{LCA}_{T'}(Z)$, then $\mathrm{LCA}_T(Y) <_T \mathrm{LCA}_T(Z)$.


Recall the example in Figure 6.2 and therein consider the contraction of 
the arc between the two green vertices in B. Observation 6.4 then states 
for Y := {f, g} and Z := {a, f, g} that $\mathrm{LCA}_B(\{f, g\}) \le_B \mathrm{LCA}_B(\{a, f, g\})$ if
and only if $\mathrm{LCA}_T(\{f, g\}) \le_T \mathrm{LCA}_T(\{a, f, g\})$. Note that this is indeed the case
as the LCA of {a, f,g} is in both phylogenetic trees the root and the LCA 
of {f,g} is the respective (lower) green vertex. 

We now give a characterization of when a phylogenetic tree softly displays an- 
other phylogenetic tree. It is based on Lemma 6.2 and the following observation. 
Note that if in a tree B it holds that $\mathrm{LCA}(\{a, b\}) <_B \mathrm{LCA}(\{b, c\}) = \mathrm{LCA}(\{a, c\})$,
then there is no vertex v such that $a, c \in \mathcal{L}(v)$ and $b \notin \mathcal{L}(v)$ as any ancestor
of $\mathrm{LCA}(\{a, c\})$ is an ancestor of $\mathrm{LCA}(\{a, b\}) <_B \mathrm{LCA}(\{a, c\})$.


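Lemma 6.5. Let $N = (V_N, A_N)$ and $T = (V_T, A_T)$ be two phylogenetic
trees on the same leaf-label set. Then, N softly displays T if and only if,
for all $u \in V_T$ and $v \in V_N$, it holds that $\mathcal{L}(T_u) \subseteq \mathcal{L}(N_v)$, $\mathcal{L}(T_u) \supseteq \mathcal{L}(N_v)$,
or $\mathcal{L}(T_u) \cap \mathcal{L}(N_v) = \emptyset$.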
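
The condition of Lemma 6.5 can be checked directly by comparing the label sets below all pairs of vertices. The following Python sketch assumes, as an illustration only, that each tree is given as a dictionary from a vertex to the list of its children and that leaves are identified with their labels; the function names are hypothetical and the running time is not optimized.

```python
def clusters(children, root):
    """Map every vertex to the frozenset of leaf labels in its subtree."""
    result = {}

    def visit(v):
        kids = children.get(v, [])
        if kids:
            result[v] = frozenset().union(*[visit(c) for c in kids])
        else:
            result[v] = frozenset([v])  # a leaf is identified with its label
        return result[v]

    visit(root)
    return result


def softly_displays(children_n, root_n, children_t, root_t):
    """Test the condition of Lemma 6.5 for two single-labeled trees.

    Both trees are assumed to have the same leaf-label set.
    """
    sets_n = set(clusters(children_n, root_n).values())
    sets_t = set(clusters(children_t, root_t).values())
    # Every pair of label sets must be nested or disjoint.
    return all(x <= y or y <= x or not (x & y) for x in sets_t for y in sets_n)


tree_n = {"r": ["x", "c"], "x": ["a", "b"]}   # displays ab|c
star_t = {"r": ["a", "b", "c"]}                # unresolved star
other_t = {"r": ["y", "a"], "y": ["b", "c"]}   # displays bc|a
```

Here `tree_n` softly displays the star (the star can be resolved to ab|c), whereas it does not softly display `other_t`, because the label sets {a, b} and {b, c} overlap without being nested.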


Proof. We start by showing that if N displays T, then for all $u \in V_T$ and $v \in V_N$,
it holds that $\mathcal{L}(T_u) \subseteq \mathcal{L}(N_v)$, $\mathcal{L}(T_u) \supseteq \mathcal{L}(N_v)$, or $\mathcal{L}(T_u) \cap \mathcal{L}(N_v) = \emptyset$. Assume
towards a contradiction that N softly displays T but there are $u \in V_N$ and $v \in V_T$
such that


$$\mathcal{L}(N_u) \nsubseteq \mathcal{L}(T_v), \quad \mathcal{L}(N_u) \nsupseteq \mathcal{L}(T_v), \quad\text{and}\quad \mathcal{L}(N_u) \cap \mathcal{L}(T_v) \ne \emptyset.$$


This is equivalent to the statement that there are three taxa x, y, and z such
that


$$x \in \mathcal{L}(N_u) \setminus \mathcal{L}(T_v), \quad y \in \mathcal{L}(N_u) \cap \mathcal{L}(T_v), \quad\text{and}\quad z \in \mathcal{L}(T_v) \setminus \mathcal{L}(N_u).$$


Since each label appears only once in N and T and N softly displays T, it
holds that there are binary resolutions $N^B$ of N and $T^B$ of T such that $N^B$
and $T^B$ are isomorphic up to subdivision of arcs. Hence, there is a vertex u'
in $N^B$ with $\mathcal{L}(N^B_{u'}) = \mathcal{L}(N_u)$ and a vertex v' in $T^B$ with $\mathcal{L}(T^B_{v'}) = \mathcal{L}(T_v)$.
Since $x, y \in \mathcal{L}(N_u) = \mathcal{L}(N^B_{u'})$ and $z \notin \mathcal{L}(N_u) = \mathcal{L}(N^B_{u'})$, it holds that


$$\mathrm{LCA}(\{x, y\}) \le_{N^B} u' <_{N^B} \mathrm{LCA}(\{y, z\}) = \mathrm{LCA}(\{x, z\}),$$
that is, $N^B$ displays xy|z. Analogously, $T^B$ displays yz|x. By Lemma 6.2, this
contradicts the fact that $T^B$ and $N^B$ are isomorphic up to subdivision of arcs.

We continue with the other direction, that is, we show that if for all vertices
$u \in V_T$ and $v \in V_N$ it holds that


$$\mathcal{L}(T_u) \subseteq \mathcal{L}(N_v), \quad \mathcal{L}(T_u) \supseteq \mathcal{L}(N_v), \quad\text{or}\quad \mathcal{L}(T_u) \cap \mathcal{L}(N_v) = \emptyset,$$


then N displays T. Using Lemma 6.2, we will show how to construct binary
trees $B_N$ and $B_T$ such that $B_N$ is a binary resolution of N, $B_T$ is a binary resolution
of T, and both display all triplets that are firmly displayed by N or T. Since
the constructions for both trees are analogous, we only focus on $B_N$ here. Consider
any vertex $v \in V_N$ that has out-degree at least three. Then, there are three
labels a, b, and c such that $\mathrm{LCA}_N(\{a, b\}) = \mathrm{LCA}_N(\{a, c\}) = \mathrm{LCA}_N(\{b, c\})$.
Let $c_a$, $c_b$, $c_c$ be the three children of v in N such that $a \in \mathcal{L}(N_{c_a})$, $b \in \mathcal{L}(N_{c_b})$,
and $c \in \mathcal{L}(N_{c_c})$. We now consider the two cases whether or not


$$\mathrm{LCA}_T(\{a, c\}) = \mathrm{LCA}_T(\{a, b\}) = \mathrm{LCA}_T(\{b, c\}).$$


If $\mathrm{LCA}_T(\{a, c\}) = \mathrm{LCA}_T(\{a, b\}) = \mathrm{LCA}_T(\{b, c\})$, then neither N nor T firmly displays
one of the triplets ab|c, ac|b, or bc|a. Hence we arbitrarily replace the arcs $(v, c_b)$
and $(v, c_c)$ by a new vertex w and new arcs $(v, w)$, $(w, c_b)$, and $(w, c_c)$. Note that
the resulting phylogenetic tree firmly displays all triplets that N firmly displayed
and the triplet bc|a. Since this procedure reduces the out-degree of one vertex
of out-degree at least three and does not introduce new vertices of out-degree
at least three, we can repeat this procedure until no vertex has out-degree at
least three any more, that is, the resulting phylogenetic tree is binary. Observe
further that $B_N$ is trivially a binary resolution of $B_N$ and therefore N softly
displays $B_N$ by definition. The construction of $B_T$ is analogous and whenever


$$\mathrm{LCA}_N(\{a, c\}) = \mathrm{LCA}_N(\{a, b\}) = \mathrm{LCA}_N(\{b, c\}) \quad\text{and}$$
$$\mathrm{LCA}_T(\{a, c\}) = \mathrm{LCA}_T(\{a, b\}) = \mathrm{LCA}_T(\{b, c\}),$$


then we construct $B_T$ to display the same triplet as $B_N$.

Note that since $B_N$ and $B_T$ are binary, they firmly display one of the following
three possible triplets ab|c, ac|b, or bc|a for each triple (a, b, c) of labels. By
Lemma 6.2, $B_N$ and $B_T$ are isomorphic up to subdivision of arcs as binary trees
are uniquely defined by their displayed triplets. Hence $B_N$ is a subdivision of a
binary resolution of both N and T and, as $B_N$ is binary, N softly displays T by
definition.
We conclude this section with a helpful lemma that lists some equivalent
characterizations of soft displaying in relevant special cases. This lemma will be 
used to show hardness of SOFT TREE CONTAINMENT in Subsection 6.3.2. 


Lemma 6.6. Let T and T’ be phylogenetic trees and let B be a binary phyloge- 
netic tree, all on the same set X of labels. 


(a) T softly displays the leaf-triplet ab|c if and only if
$\mathrm{LCA}_T(\{a, b\}) \le_T \mathrm{LCA}_T(\{b, c\}) = \mathrm{LCA}_T(\{a, c\})$.
(b) T softly displays B if and only if T softly displays all triplets that B displays
firmly.


(c) T softly displays a tree T' (and vice versa) if and only if there is a binary
tree B on X that is softly displayed by both T and T'.


Proof. We prove the three statements one after another. To verify statement (a),
note that, by definition, T softly displays ab|c if and only if there is a binary
resolution $T_B$ of T displaying ab|c. By Lemma 6.2, $T_B$ firmly displays ab|c if
and only if


$$\mathrm{LCA}_{T_B}(\{a, b\}) <_{T_B} \mathrm{LCA}_{T_B}(\{a, c\}) = \mathrm{LCA}_{T_B}(\{b, c\}).$$
Since $T_B$ is binary, this is equivalent to
$$\mathrm{LCA}_{T_B}(\{a, b\}) \le_{T_B} \mathrm{LCA}_{T_B}(\{a, c\}) = \mathrm{LCA}_{T_B}(\{b, c\}),$$
which by Observation 6.4 is equivalent to
$$\mathrm{LCA}_T(\{a, b\}) \le_T \mathrm{LCA}_T(\{a, c\}) = \mathrm{LCA}_T(\{b, c\}).$$


We next prove statement (b). To this end, first assume towards a contradiction
that T displays B but a triplet ab|c that B displays firmly is not displayed softly
by T. Then, $\{\mathrm{LCA}_T(\{a, b\}), \mathrm{LCA}_T(\{a, c\}), \mathrm{LCA}_T(\{b, c\})\}$ has a unique minimum
x with respect to $\le_T$ and it holds by statement (a) that $x \ne \mathrm{LCA}_T(\{a, b\})$
(as otherwise T displays ab|c). Without loss of generality, let $x = \mathrm{LCA}_T(\{a, c\})$.
Since T has a binary resolution that is isomorphic to B up to subdivision of
arcs, it holds that T is a contraction of a subdivision of B. Hence, Observation 6.4
states that $\mathrm{LCA}_B(\{a, c\}) <_B \mathrm{LCA}_B(\{a, b, c\})$ and thus B displays ac|b.
Note that a binary tree cannot display ab|c and ac|b and thus we reached a
contradiction.

Now, assume towards a contradiction that $T = (V_T, A_T)$ does not softly
display $B = (V_B, A_B)$ but displays all triplets that are firmly displayed by B.
Since T does not display B, there are by Lemma 6.5 vertices $u \in V_T$ and $v \in V_B$
and labels x, y, and z such that $x \in \mathcal{L}(T_u) \setminus \mathcal{L}(B_v)$, $y \in \mathcal{L}(B_v) \setminus \mathcal{L}(T_u)$,
and $z \in \mathcal{L}(T_u) \cap \mathcal{L}(B_v)$. Thus,


$$\mathrm{LCA}_T(\{x, z\}) \le_T u <_T \mathrm{LCA}_T(\{x, y, z\}) \quad\text{and}$$
$$\mathrm{LCA}_B(\{y, z\}) \le_B v <_B \mathrm{LCA}_B(\{x, y, z\}).$$


By statement (a), T displays xz|y and B displays yz|x. Since T displays all
triplets that B displays firmly, T displays yz|x. Again by (a), we can conclude
that $\mathrm{LCA}_T(\{y, z\}) \le_T \mathrm{LCA}_T(\{x, z\}) \le_T u$. Thus, $y \in \mathcal{L}(T_u)$, a contradiction.
It remains to show statement (c). By definition, T softly displays T' if and
only if there are binary resolutions B and B’ of T and T’, respectively, such 
that B firmly displays B’. If such phylogenetic trees exist, then they are by 
Lemma 6.2 isomorphic up to subdivision of arcs. Thus, B is a binary resolution 
of a subdivision of T’ and the statement follows. 


6.3 Multi-labeled Trees and k-SAT 


In this section, we study SOFT TREE CONTAINMENT for multi-labeled phy- 
logenetic trees. We will show a strong connection between k-SAT and SOFT 
TREE CONTAINMENT on k-labeled phylogenetic trees in the sense that there 
is a polynomial-time reduction from k-SAT to SOFT TREE CONTAINMENT
on k-labeled phylogenetic trees and a k-SAT program for SOFT TREE CONTAINMENT
on k-labeled phylogenetic trees. This yields the dichotomy result that
SOFT TREE CONTAINMENT on k-labeled phylogenetic trees is polynomial-time
solvable if $k \le 2$ and NP-hard if $k \ge 3$. We start with a characterization of when
a multi-labeled phylogenetic tree softly displays a single-labeled phylogenetic 
tree T. 


Lemma 6.7. Let M be a multi-labeled phylogenetic tree and let T be a single- 
labeled phylogenetic tree on the same set X of labels. Then, M softly displays T 
if and only if M contains (as a subgraph) a single-labeled phylogenetic tree S 
on X that softly displays T. 
Proof. We will first show that if $M := (V_M, A_M)$ softly displays $T := (V_T, A_T)$,
then M contains a single-labeled phylogenetic tree S that softly displays T.
Note that, by definition, if M softly displays T, then there are binary resolutions
$M_B := (V_B, A_B)$ of M and $T_B$ of T and subdivisions $M_B^S$ of $M_B$ and $T_B^S$
of $T_B$ such that $M_B^S$ contains $T_B^S$ as a subgraph (respecting leaf labels). Let $S_B^S$
be the subgraph of $M_B^S$ that is isomorphic to $T_B^S$. Let $S_B$ be the phylogenetic
tree that is the result of reverting all subdivisions from $M_B$ to $M_B^S$ in $S_B^S$, that
is, suppressing each vertex v that is contained in $S_B^S$ but not in $M_B$. Note
that $S_B^S$ is a subdivision of $S_B$ and $S_B$ is a single-labeled subgraph of $M_B$.
Let $\chi\colon V_B \to V_M$ be the contraction function of $M_B$ for M, that is, the function
mapping each vertex u in $M_B$ to the vertex $\chi(u)$ in M that u is contracted to
when forming M. Moreover, let S be the result of contracting each arc (u, v)
in $S_B$ with $\chi(u) = \chi(v)$. Note that for each vertex v in $M_B$ it holds that
all vertices u with $\chi(u) = \chi(v)$ contract to a single vertex in M and hence
these vertices form, by definition of contraction functions, a weakly connected
component in $M_B$. Further, since $M_B$ is a tree and $S_B$ is a subtree of $M_B$, it
holds for each vertex v' in $S_B$ that all vertices u' with $\chi(u') = \chi(v')$ form a
weakly connected component in $S_B$. Thus, the phylogenetic tree S contains no
two vertices u' and v' with $\chi(u') = \chi(v')$. Since M is the result of contracting
each arc (u, v) with $\chi(u) = \chi(v)$ in $M_B$, since S is the result of contracting
each arc (u, v) with $\chi(u) = \chi(v)$ in $S_B$, and since $S_B$ is a subtree of $M_B$, it holds
that S is a subtree of M. Concluding, S is a single-labeled subtree of M, S
has a binary resolution $S_B$, $S_B$ has a subdivision $S_B^S$, and $S_B^S$ is by assumption
isomorphic to $T_B^S$. Thus, S softly displays T by definition.

It remains to show that if M contains a single-labeled subtree S which softly
displays T, then M softly displays T. If M contains a single-labeled subtree
S that softly displays T, then there are by definition binary resolutions $S_B$
and $T_B$ of S and T, respectively, and subdivisions $S_B^S$ of $S_B$ and $T_B^S$ of $T_B$
such that $S_B^S$ and $T_B^S$ are isomorphic. We will show that M softly displays T,
that is, there is a binary resolution $M_B$ of M that has a subdivision $M_B^S$ that
contains $T_B^S$ as a subgraph. First, to avoid ambiguity, we relabel each leaf
that is not contained in S such that the resulting tree M' is a single-labeled
tree on a set $X' \supseteq X$ of labels. This allows us to again refer to leaves of M'
in terms of labels. Note that only labels for leaves not contained in S are
different between M and M' and hence M' also contains S as a subgraph.
Let $M'_B$ be any binary resolution of M' that satisfies the following property. If
for three labels $a, b, c \in X$ it holds that $\mathrm{LCA}(a, b) <_{S_B} \mathrm{LCA}(a, c) = \mathrm{LCA}(b, c)$,
then $\mathrm{LCA}(a, b) <_{M'_B} \mathrm{LCA}(a, c) = \mathrm{LCA}(b, c)$. Note that $M'_B$ contains a subdivision
of $S_B$ as a subtree. Hence, $M'_B$ firmly displays $T_B$. Finally, let $M_B$ be
the multi-labeled phylogenetic tree resulting from replacing the labels in $M'_B$
with their original labels from X. Since M and M' only differ in these labels, it
holds that $M_B$ is a binary resolution of M. Further, since S does not contain
any of the leaves in which $M_B$ and $M'_B$ differ, it holds that $M_B$ contains a
subdivision of $S_B$ as a subgraph. Thus, there is a binary resolution $M_B$ of M
and $M_B$ contains a subdivision of $S_B$ as a subgraph which firmly displays $T_B$,
that is, M softly displays T.


We will use the characterization shown in Lemma 6.7 to prove both sides 
of the dichotomy result in this chapter. In Subsection 6.3.1, we present 
a k-SAT program for SOFT TREE CONTAINMENT on k-labeled phylogenetic 
trees. This implies that SOFT TREE CONTAINMENT is polynomial-time solvable 
for 2-labeled phylogenetic trees. In Subsection 6.3.2, we complement this result 
with a reduction from k-SAT to SOFT TREE CONTAINMENT on k-labeled
phylogenetic trees. This implies that SOFT TREE CONTAINMENT on k-labeled
phylogenetic trees is NP-hard for each $k \ge 3$.
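
The polynomial-time claim for 2-labeled trees rests on the fact that 2-SAT formulas can be solved in linear time. As a reminder of how this works, the following Python sketch implements the standard approach via strongly connected components of the implication graph (Kosaraju's algorithm). It is a generic textbook solver, not code from this thesis; clauses are assumed to be given as pairs of nonzero integers, where +i and -i denote the i-th variable and its negation.

```python
def solve_2sat(num_vars, clauses):
    """Return a satisfying assignment (list of bools) or None if unsatisfiable."""
    def node(lit):
        # Variable i uses node 2(i-1) for the positive and 2(i-1)+1 for the negative literal.
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    reverse = [[] for _ in range(n)]
    for a, b in clauses:
        # The clause (a or b) yields the implications (not a -> b) and (not b -> a).
        for x, y in ((node(-a), node(b)), (node(-b), node(a))):
            graph[x].append(y)
            reverse[y].append(x)

    # First pass: iterative depth-first search recording the finishing order.
    order, seen = [], [False] * n
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, iter(graph[start]))]
        while stack:
            v, neighbors = stack[-1]
            for w in neighbors:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(graph[w])))
                    break
            else:
                order.append(v)
                stack.pop()

    # Second pass on the reversed graph: components are found in topological order.
    comp, count = [-1] * n, 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = count
        stack = [start]
        while stack:
            v = stack.pop()
            for w in reverse[v]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1

    assignment = []
    for i in range(num_vars):
        if comp[2 * i] == comp[2 * i + 1]:
            return None  # a variable and its negation are equivalent
        assignment.append(comp[2 * i] > comp[2 * i + 1])
    return assignment


def satisfies(assignment, clauses):
    """Check that every clause contains at least one true literal."""
    return all(any(assignment[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
```

A unit clause can be passed as a pair with a repeated literal, for example `(3, 3)`.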


6.3.1 Reduction to k-SAT 


In this subsection, we present a k-SAT program for SOFT TREE CONTAINMENT 
on k-labeled phylogenetic trees. The basic idea is a bottom-up approach that 
computes for each vertex u in the single-labeled phylogenetic tree T a set M(u) 
of candidates. Each such candidate is a vertex v in the k-labeled phylogenetic 
tree N such that the subtree $N_v$ of N rooted in v displays $T_u$ and for no
descendant w of v it holds that $N_w$ displays $T_u$. We will later show that there
are at most k such candidates for each vertex in T. Afterwards, we will show 
how to compute the set M(u) for each vertex u in T in a bottom-up manner 
using k-SAT. 

Note that if N displays T, then, by Lemma 6.7, N contains a single-labeled 
subtree S that displays T. We call S canonical for some vertex u in T 
if $\mathrm{LCA}_S(\mathcal{L}(T_u)) \in M(u)$ and canonical for T if it is canonical for all vertices
in T. We start by showing that softly displaying is equivalent to having
such a canonical subtree. 


Lemma 6.8. A k-labeled tree N softly displays a single-labeled tree T if and 
only if N has a canonical subtree for T. 
Proof. Let r be the root of T. If $N := (V_N, A_N)$ has a canonical subtree S
for $T := (V_T, A_T)$, then, by definition, S contains a vertex v such that $S_v$
displays $T_r = T$. Hence, N contains a single-labeled tree S that displays T and,
by Lemma 6.7, this shows that N displays T.

It remains to show that if N displays T, then N contains a canonical subtree S
for T. If N displays T, then N contains by Lemma 6.7 a single-labeled subtree S
that displays T. Assume towards a contradiction that S is not canonical for T.
Let $u \in V_T$ be a vertex for which S is not canonical but S is canonical for
all ancestors of u in T. Note that $u \ne r$ as S displays $T = T_r$ by assumption.
Let p be the parent of u in T. Since S is canonical for p, there is a
vertex $y = \mathrm{LCA}_S(\mathcal{L}(T_p))$ in S such that $S_y$ displays $T_p$. Let $S'_y := S_y|_{\mathcal{L}(T_p)}$,
that is, $S'_y$ is the subtree of $S_y$ containing all leaves in $T_p$ and no other. By
Lemma 6.6(c), there is a binary single-labeled phylogenetic tree B on $\mathcal{L}(T_p)$
which is displayed by $S'_y$ and $T_p$. By Lemma 6.6(b), $S'_y$ displays each triplet
which is firmly displayed by B. Let $x := \mathrm{LCA}_S(\mathcal{L}(T_u))$. Since S is not canonical
for u, it holds that $S_x$ does not display $T_u$, or there is a descendant z of x such
that $S_z$ displays $T_u$. By definition of x, for no descendant z of x the subtree $S_z$
can display $T_u$ as for each such z there is a label $\ell \in \mathcal{L}(T_u) \setminus \mathcal{L}(S_z)$ and therefore
no triplet containing $\ell$ can be displayed by $S_z$. Hence, $S_x$ does not display $T_u$.
Recall that there is a binary phylogenetic tree B which is displayed by $S'_y$ and $T_p$.
Let $B' = B|_{\mathcal{L}(T_u)}$ and let ab|c be any triplet that B' displays firmly. Since B'
is a subtree of B it holds that B firmly displays ab|c. Hence, $S'_y$ and $T_p$ softly
display ab|c. If $T_u$ does not display ab|c, then, by Observation 6.3(b), it firmly
displays ac|b or bc|a. Since $T_u$ is a subtree of $T_p$, also $T_p$ firmly displays ac|b
or bc|a. By Observation 6.3(b), $T_p$ then does not display ab|c, a contradiction.
Analogously, if $S_x$ does not display ab|c, then $S'_y$ does not display ab|c, another
contradiction. Thus, both $S_x$ and $T_u$ display all triplets that are displayed
by B', and $S_x$ therefore displays $T_u$ by Lemma 6.6, yielding a final contradiction
to the assumption that S is not canonical for u.


As stated above, we compute M(u) for each vertex u in T in a bottom-up 
fashion. We will now show that $|M(u)| \le k$ for each $u \in V_T$.


Lemma 6.9. Let N be a k-labeled phylogenetic tree and let $T := (V_T, A_T)$ be a
single-labeled phylogenetic tree. Then, it holds for each $u \in V_T$ that $|M(u)| \le k$.


Proof. We prove the statement by induction over the height of a vertex u in T. 
If the height of u is 0, that is, u is a leaf, then M(u) contains all leaves in N 
that have the same label as u. As N is k-labeled, each candidate set M(u) for a 
Figure 6.4: Two phylogenetic trees N (left-hand side) and T (right-hand side). The
vertices in T are colored and for each vertex u in T all vertices in M(u) in N are 
colored with the same color as u. The ascending paths of the two red vertices in N are 
drawn with bold arcs and the ascending paths of the two blue vertices are indicated 
by dashed arcs. 


leaf u is of size at most k. If u is not a leaf, then let c be a child of u in T and
assume that $|M(c)| \le k$. Consider any vertex $v \in M(u)$. Since $N_v$ displays $T_u$,
there is by Lemma 6.8 a subtree $S_v$ of $N_v$ that is canonical for $T_u$. Hence, there
is a vertex $w \in M(c)$ in $S_v$, that is, $S_w$ displays $T_c$ and w is a candidate for c.
Note that v is the only ancestor of w in M(u) as M(u) only contains minima.
Thus, each vertex in M(u) has a descendant in M(c) that it shares with no
other vertex in M(u) and since $|M(c)| \le k$, it holds that $|M(u)| \le k$.


Note that the proof of Lemma 6.9 also states that for each vertex u in T that 
is not a leaf, each child c of u in T, and each $w \in M(c)$, there is at most one
ancestor v of w in N which is contained in M(u). We call the unique v-w-path
in N the ascending path of w with respect to c and we omit mentioning c if it is
clear from the context. An example of ascending paths is given in Figure 6.4. We
next present a crucial lemma about ascending paths which states that ascending
paths with respect to two vertices $c_1$ and $c_2$ are arc-disjoint unless $c_1$ and $c_2$
are siblings in T. Afterwards, we present our k-SAT program for SOFT TREE 
CONTAINMENT on k-labeled phylogenetic trees using the notions of candidate 
sets and ascending paths. 


Lemma 6.10. Let N be a multi-labeled phylogenetic tree, let T be a single- 
labeled phylogenetic tree, and let N display T. Let S be a canonical subtree of N 
for T. Let u and v be two distinct vertices in T such that neither of them is the 
root of T and u and v are not siblings in T. Let $\mathrm{LCA}_S(\mathcal{L}(T_u))$ and $\mathrm{LCA}_S(\mathcal{L}(T_v))$
have ascending paths R and Q with respect to u and v, respectively. Then, R 
and Q are arc-disjoint. 


Proof. To prove the statement, we distinguish between the two cases where u
and v are in an ancestor-descendant relation or not. If u and v are in an
ancestor-descendant relation, then without loss of generality let $u <_T v$. Let p be the
parent of u in T. Note that $p \le_T v$ and hence $\mathrm{LCA}_S(\mathcal{L}(T_p)) \le_S \mathrm{LCA}_S(\mathcal{L}(T_v))$.
Thus, each vertex in the ascending path R of u is either v or a descendant of v
in T. Since the ascending path Q of v only contains v and ancestors of v in T,
it holds that R and Q share at most one vertex (v) and no arcs.

If u and v are not in an ancestor-descendant relation in T, then assume towards
a contradiction that the ascending paths R and Q share an inner vertex z. Since z
is an ancestor of both u and v (recall that we do not distinguish between the
vertices of T and their images in S), it holds that $\mathcal{L}(T_u) \cup \mathcal{L}(T_v) \subseteq \mathcal{L}(S_z)$. As u
and v are not siblings in T, one of u and v has a parent p that is not in an
ancestor-descendant relation with the other. Assume without loss of generality
that p is the parent of u. Since v and p are not in an ancestor-descendant
relation and since T is a single-labeled phylogenetic tree, it holds that


$$\mathcal{L}(T_p) \cap \mathcal{L}(S_z) \supseteq \mathcal{L}(T_u) \ne \emptyset \quad\text{and}\quad \mathcal{L}(S_z) \setminus \mathcal{L}(T_p) \supseteq \mathcal{L}(T_v) \ne \emptyset.$$


Since S is canonical, it holds that $y := \mathrm{LCA}_S(\mathcal{L}(T_p)) \in M(p)$ and, thus, the
ascending path R starts in y. As z is an inner vertex of R, it holds that $z <_S y$,
implying

$$\mathcal{L}(T_p) \setminus \mathcal{L}(S_z) \ne \emptyset.$$


Concluding, it holds that


$$\mathcal{L}(T_p) \cap \mathcal{L}(S_z) \ne \emptyset, \quad \mathcal{L}(S_z) \setminus \mathcal{L}(T_p) \ne \emptyset, \quad\text{and}\quad \mathcal{L}(T_p) \setminus \mathcal{L}(S_z) \ne \emptyset$$


and, by Lemma 6.5, this contradicts the assumption that S softly displays T.


We next present the idea behind the main result in this section. To this end, 
let r be the root of T. Clearly, N displays T if and only if $M(r) \ne \emptyset$. Hence,
it remains to show how to compute M(u) given M(v) for all $v \ne u$ in $T_u$. We
do so via a reduction to k-SAT that checks for each y in N whether $y \in M(u)$.
Therein, we have a variable $x_{z \to c}$ for each vertex $c \ne u$ in $T_u$ and $z \in M(c)$
that represents whether $S_z$ displays $T_c$, where S is the canonical subtree of $N_y$
for $T_u$. The formula then checks whether these choices are consistent, that is,
if $x_{a \to w}$ and $x_{z \to v}$ are set to true and w is a descendant of v in T, then a is
a descendant of z in N (or a = z). Finally, the formula checks whether these
choices satisfy Lemma 6.10. After presenting the formula, we will prove that it
is correct, that is, $\varphi_{y \to u}$ is satisfiable if and only if $N_y$ displays $T_u$.


Construction 6.11. Construct $\varphi_{y \to u}$ as follows. For each $v \ne u$ in $T_u$ and for
each $z \in M(v)$, introduce a variable $x_{z \to v}$. Moreover, for each $v \ne u$ in $T_u$,


(1) add the clause $\bigvee_{z \in M(v)} x_{z \to v}$,
(2) for each pair $z_1, z_2 \in M(v)$ of distinct vertices, add the clause $\lnot x_{z_1 \to v} \lor \lnot x_{z_2 \to v}$,


(3) if the parent p of v in $T_u$ is not u, then for all $z \in M(v)$ and all $q \in M(p)$
with $z \nleq_N q$, add the clause $\lnot x_{z \to v} \lor \lnot x_{q \to p}$, and


(4) for each $w \ne u$ in $T_u$ that is not a sibling of v, each $z_1 \in M(v)$, and
each $z_2 \in M(w)$ such that the ascending paths of $z_1$ and $z_2$ in $N_y$ share an
arc, add the clause $\lnot x_{z_1 \to v} \lor \lnot x_{z_2 \to w}$.


Note that the ascending path of $z_1$ or $z_2$ in (4) is not defined if v or w is a
child of u in T, as M(u) is not defined. In this case we call the unique $y$-$z_1$-path
or the unique $y$-$z_2$-path the ascending path as we test whether $y \in M(u)$. We
next show that Construction 6.11 is correct. Since we use the construction
to test whether $y \in M(u)$ and since M(u) can, by definition, not contain two
vertices that are in an ancestor-descendant relation, we assume that $\varphi_{z \to u}$ is
not satisfiable for any descendant z of y in N.
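
The four clause types of Construction 6.11 can be generated mechanically once the sets M(v) and the relevant relations in N are known. The following Python sketch illustrates this under several assumptions that are not part of the construction itself: the function name, the literal encoding, and the two callbacks `leq_N` and `paths_share_arc` are hypothetical stand-ins for the descendant test in N and the shared-arc test on ascending paths. The sketch also assumes that (4) ranges over w different from v, and it recognizes siblings by a common parent in T_u.

```python
from itertools import combinations

def build_clauses(u, parent, M, leq_N, paths_share_arc):
    """Generate clauses of types (1)-(4) for one formula (illustrative sketch).

    A literal is a pair (sign, (z, v)): sign True stands for the variable
    x_{z->v}, sign False for its negation. Hypothetical inputs:
      parent[v]   -- parent of v in T_u, for every vertex v != u of T_u
      M[v]        -- list of candidate vertices of N for v
      leq_N(a, b) -- True iff a == b or a is a descendant of b in N
      paths_share_arc(z1, v, z2, w) -- True iff the ascending paths share an arc
    """
    vertices = list(parent)
    clauses = []
    for v in vertices:
        # Type (1): at least one candidate is chosen for v.
        clauses.append([(True, (z, v)) for z in M[v]])
        # Type (2): no two distinct candidates are chosen for v.
        for z1, z2 in combinations(M[v], 2):
            clauses.append([(False, (z1, v)), (False, (z2, v))])
        # Type (3): the choice for v must lie below the choice for its parent.
        p = parent[v]
        if p != u:
            for z in M[v]:
                for q in M[p]:
                    if not leq_N(z, q):
                        clauses.append([(False, (z, v)), (False, (q, p))])
        # Type (4): ascending paths of non-siblings must not share an arc.
        for w in vertices:
            if w == v or parent[w] == parent[v]:  # assumption: skip w == v
                continue
            for z1 in M[v]:
                for z2 in M[w]:
                    if paths_share_arc(z1, v, z2, w):
                        clauses.append([(False, (z1, v)), (False, (z2, w))])
    return clauses

# Toy data (entirely made up): T_u has root "u" with children "p" and "d",
# and "c" is a child of "p". Integers stand for vertices of N.
parent = {"p": "u", "c": "p", "d": "u"}
M = {"p": [10], "c": [11, 12], "d": [20]}
leq = lambda a, b: a == b or (a, b) == (11, 10)
share = lambda z1, v, z2, w: {z1, z2} == {12, 20}
clauses = build_clauses("u", parent, M, leq, share)
```

Each type (4) clause is emitted once per ordered pair (v, w), so it appears twice; this is harmless for satisfiability and keeps the sketch short.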


Lemma 6.12. Let u be a vertex in T and let y be a vertex in N such that for
each descendant d of y in N it holds that $\varphi_{d \to u}$ is not satisfiable. Then, $\varphi_{y \to u}$
is satisfiable if and only if $N_y$ displays $T_u$.


Proof. We start by showing that if $N_y$ displays $T_u$, then $\varphi_{y \to u}$ is satisfiable. To
this end, let S be a canonical subtree of $N_y$ that displays $T_u$. Note that S exists
due to Lemma 6.8. Let $\beta$ be a truth assignment for $\varphi_{y \to u}$ that sets each
variable $x_{z \to v}$ to true if and only if $z = \mathrm{LCA}_S(\mathcal{L}(T_v))$. We will show that all clauses
in Construction 6.11 are satisfied by this assignment. Note that for each $v \neq u$
in $T_u$ it holds that $S_z$ displays $T_v$ and $z \in M(v)$ where $z := \mathrm{LCA}_S(\mathcal{L}(T_v))$. Hence
each clause of type (1) is satisfied by $\beta$. Moreover, since the LCA in S is unique
(as S is a tree), also all clauses of type (2) are satisfied by $\beta$.

Assume towards a contradiction that a clause of type (3) is not satisfied.
Then, there is some v with parent p in $T_u$ such that $y \not\leq_N z$ for some $y \in M(v)$
and $z \in M(p)$ and $\beta(x_{y \to v}) = \beta(x_{z \to p}) =$ true. Since $\mathcal{L}(T_p) \supseteq \mathcal{L}(T_v)$, it holds
that $y \leq_S z$. Moreover, since S is a subtree of N, it holds that $y \leq_N z$,
contradicting $y \not\leq_N z$. Thus, all clauses of type (3) are satisfied.

Finally, if a clause of type (4) is not satisfied, then there are $x_{y \to v}$ and $x_{z \to w}$
such that

1. v and w are not siblings in T and neither of them is the root of T,
2. $\beta(x_{y \to v}) = \beta(x_{z \to w}) =$ true, and
3. the ascending paths of $y = \mathrm{LCA}_S(\mathcal{L}(T_v))$ and $z = \mathrm{LCA}_S(\mathcal{L}(T_w))$ in $N_y$
share an arc.

This contradicts Lemma 6.10 and therefore all clauses of type (4) are satisfied.
Since each clause of $\varphi_{y \to u}$ is satisfied by $\beta$, the formula is satisfiable.

We next show that if $\varphi_{y \to u}$ is satisfiable, then $N_y$ displays $T_u$. To this end,
let $\beta$ be a satisfying truth assignment for $\varphi_{y \to u}$. Let S be the subtree of $N_y$
that contains y and all leaves z such that $\beta(x_{z \to v}) =$ true for some leaf v
in $T_u$ (and no other leaves except for possibly y). We will show that S is
canonical for $T_u$. To this end, we first show that S contains each vertex z such
that $\beta(x_{z \to v}) =$ true for some vertex v in $T_u$. Note that $\varphi_{y \to u}$ contains for
each $v \neq u$ in $T_u$ at most one vertex z such that $\beta(x_{z \to v}) =$ true as otherwise
the respective clause of type (2) was not satisfied by $\beta$. It also contains at least
one such vertex as otherwise the clause of type (1) was not satisfied. For the
sake of readability, we will denote this unique vertex z by $\psi(v)$ for each vertex v.
As a special case, we define $\psi(u) := y$. We will show by induction over the
height of v that $\psi(v)$ is contained in S and that $S_{\psi(v)}$ displays $T_v$. The height of
a vertex v in a tree is the maximum distance between v and a descendant of v.
If v is a leaf, then $\psi(v)$ is by definition contained in S, and $S_{\psi(v)}$ displays $T_v$.
If v is not a leaf, then let c be a child of v in $T_u$. Since c has smaller height
than v in $T_u$, it holds by induction hypothesis that $\psi(c)$ is contained in S.
If $\psi(v)$ was not contained in S, then $\psi(v)$ is not an ancestor of $\psi(c)$. This,
however, contradicts the clause of type (3). Hence, each vertex $\psi(v)$ for some
vertex $v \neq u$ in $T_u$ is contained in S. It remains to show that $S_{\psi(v)}$ displays $T_v$.
Assume towards a contradiction that $S_{\psi(v)}$ does not display $T_v$. By Lemma 6.5,
there are vertices w in $T_v$ and q in $S_{\psi(v)}$ and leaves

$$a \in \mathcal{L}(S_q) \setminus \mathcal{L}(T_w), \quad b \in \mathcal{L}(T_w) \setminus \mathcal{L}(S_q), \quad\text{and}\quad c \in \mathcal{L}(T_w) \cap \mathcal{L}(S_q).$$


On the one hand, note that $a \leq_S q$ and $c \leq_S q$ and therefore there is a highest
ancestor $\alpha$ of a in T with $\psi(\alpha) \leq_S q$ and a highest ancestor $\gamma$ of c in T
with $\psi(\gamma) \leq_S q$. By the definitions of $\alpha$ and $\gamma$, there are parents $p_\alpha$ and $p_\gamma$
of $\alpha$ and $\gamma$, respectively, such that $\psi(p_\alpha) \not\leq_S q$ and $\psi(p_\gamma) \not\leq_S q$. Hence, the
ascending paths of $\psi(\alpha)$ and $\psi(\gamma)$, respectively, share q as an inner vertex and
the arc $(p_q, q)$ where $p_q$ is the parent of q in S. Note that q has a parent as there
is a leaf with label b that is not contained in $S_q$ but in $S_{\psi(v)}$. On the other hand,
note that $\alpha \not\leq_T w$ and $\gamma \leq_T w$, implying that $\alpha$ and $\gamma$ are not siblings in T,
contradicting the assumption that all clauses of type (4) are satisfied by $\beta$.


We next show our main result in this chapter, that is, a k-SAT program for
SOFT TREE CONTAINMENT on k-labeled graphs for each $k \geq 2$. We mention
that the program resulting from SOFT TREE CONTAINMENT on single-labeled
phylogenetic trees contains clauses with two literals (the clauses of types (3)
and (4) in Construction 6.11). Since 2-SAT formulas are linear-time solvable,
the following result proves that SOFT TREE CONTAINMENT on single-labeled
phylogenetic trees is polynomial-time solvable. In the paper on which this
chapter is based, we also present a linear-time algorithm for SOFT TREE
CONTAINMENT on single-labeled phylogenetic trees [BMW18].
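
To make the claim that 2-SAT formulas are linear-time solvable concrete, the following Python sketch shows one standard approach, which is not necessarily the one used in this thesis: build the implication graph of the formula and compute its strongly connected components with Kosaraju's algorithm. The function name and the literal encoding (a nonzero integer i for x_i and -i for its negation) are assumptions made for this illustration only.

```python
def solve_2sat(num_vars, clauses):
    """Return a satisfying assignment (list of bools, index i-1 for x_i)
    or None if the formula is unsatisfiable.

    clauses: iterable of tuples with one or two nonzero integer literals.
    """
    def node(lit):
        # x_i -> even node 2(i-1), its negation -> odd node 2(i-1)+1
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    rgraph = [[] for _ in range(n)]
    for clause in clauses:
        a, b = clause if len(clause) == 2 else (clause[0], clause[0])
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for src, dst in ((node(-a), node(b)), (node(-b), node(a))):
            graph[src].append(dst)
            rgraph[dst].append(src)

    # Pass 1: iterative DFS on the implication graph, record finish order.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            v, i = stack.pop()
            if i < len(graph[v]):
                stack.append((v, i + 1))
                w = graph[v][i]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)

    # Pass 2: DFS on the reversed graph in decreasing finish time.
    # Components are numbered in topological order of the implication graph.
    comp, c = [-1] * n, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack = [s]
        while stack:
            v = stack.pop()
            for w in rgraph[v]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1

    assignment = []
    for i in range(num_vars):
        pos, neg = comp[2 * i], comp[2 * i + 1]
        if pos == neg:          # x_i and its negation imply each other
            return None
        assignment.append(pos > neg)
    return assignment

def satisfies(assignment, clauses):
    """Check an assignment against a list of clauses (helper for the demo)."""
    return all(any(assignment[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses)

sat_clauses = [(1, 2), (-1, 2), (-2, 3)]
sat_result = solve_2sat(3, sat_clauses)
unsat_result = solve_2sat(1, [(1,), (-1,)])
```

Both passes touch every node and arc a constant number of times, so the running time is linear in the length of the formula.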


Theorem 6.13. For each $k \geq 2$, one can decide in $O(n^5 \cdot k^2)$ time whether a
k-labeled phylogenetic tree N softly displays a single-labeled phylogenetic tree T
using O(n) queries of size $O(n^2 \cdot k^2)$ to k-SAT.


Proof. The algorithm computes for each vertex u in T at most k vertices M(u)
such that for each $v \in M(u)$ the subtree $N_v$ displays $T_u$, and for no descendant w
of v it holds that $N_w$ displays $T_u$. It computes this set M(u) bottom-up for
each vertex u in T. The pseudo-code is given in Algorithm 6.1. All possible
candidates for vertices in M(u) that are found by the algorithm are compared in
Line 16 and all non-minima are removed. Hence, the set M(u) computed by the
algorithm only contains minima. Hence, it remains to show that if for a vertex v
in N it holds for no descendant w of v that $N_w$ displays $T_u$, then $v \in M(u)$
if and only if $N_v$ displays $T_u$. Let v be such a vertex. Note that since for no
descendant w of v it holds that $N_w$ displays $T_u$, it holds by Lemma 6.12
that $\varphi_{w \to u}$ is not satisfiable for any descendant w of v in N. Hence, Lemma 6.12
states that $\varphi_{v \to u}$ is satisfiable if and only if $N_v$ displays $T_u$. Thus, it remains
to show that $v \in M(u)$ if and only if $\varphi_{v \to u}$ is satisfiable. To this end, note
that since $N_v$ displays $T_u$ it also displays $T_c$ for any descendant c of u in T.
Let c be the child of u chosen in Line 6 and let $v' \in M(c)$ be a descendant of v
(or $v' = v$). We now consider the iteration of Line 7 where $w = v'$. If $v' = v$,
then the algorithm adds v to M(u). If $v' \neq v$, then note that $v'$ is a descendant

Algorithm 6.1: A k-SAT program for SOFT TREE CONTAINMENT on
k-labeled phylogenetic trees.
Input: A k-labeled phylogenetic tree N and a single-labeled
phylogenetic tree T.
Output: true if N displays T and false otherwise.
 1  r ← root of T
 2  foreach vertex u in T do                 // in a bottom-up manner
 3      M(u) ← ∅
 4      if u is a leaf in T then M(u) ← {v ∈ N | L(v) = L(u)}
 5      else
 6          c ← any child of u in T          // c can be chosen arbitrarily
 7          foreach w ∈ M(c) do
 8              w' ← w
 9              while w' ≠ ⊥ do
10                  construct φ_{w'→u}
11                  if φ_{w'→u} is satisfiable then
12                      M(u) ← M(u) ∪ {w'}
13                      w' ← ⊥
14                  else
15                      w' ← parent of w' in N   // if w' is the root of N, then w' ← ⊥
16      if ∃ a, b ∈ M(u). a <_N b then remove b from M(u)
17  if M(r) ≠ ∅ then return true
18  else return false


of v and hence $\varphi_{v' \to u}$ is not satisfiable. The algorithm then iteratively tries each
ancestor $v^*$ of $v'$ and checks whether $\varphi_{v^* \to u}$ is satisfiable. The formula $\varphi_{v^* \to u}$ is
not satisfiable for each descendant $v^*$ of v and hence eventually $\varphi_{v \to u}$ is tested.
By assumption, $\varphi_{v \to u}$ is satisfiable and hence v is added to M(u). Thus, the
set M(u) is computed correctly by Algorithm 6.1 for each vertex u in T. Finally,
observe that N displays T if and only if $M(r) \neq \emptyset$ where r is the root of T.

It remains to analyze the number and sizes of the constructed formulas and
the running time of the algorithm. Note that all clauses of type (1) are of size
at most k and all other clauses are of size at most 2. Hence for each $k \geq 2$ the
resulting formulas are k-SAT formulas. We first analyze the size of each formula.
Note that there are O(n) clauses of type (1), $O(n \cdot k^2)$ clauses of type (2) and (3),
and $O(n^2 \cdot k^2)$ clauses of type (4). Since only clauses of type (1) are not of
constant size, each formula is of size $O(n^2 \cdot k^2)$.

Note that we construct at most one formula $\varphi_{v \to u}$ for each pair (v, u) of
vertices where v is a vertex of N and u is a vertex in T. Hence, there are
at most $n^2$ such formulas. It remains to analyze the running time of the
algorithm (excluding the steps to solve the k-SAT formulas). The running time
is dominated by the time to construct all formulas. Since we construct $O(n^2)$
formulas of size at most $O(n^2 \cdot k^2)$, it remains to analyze the running time to
construct each clause. Clauses of type (1) and (2) take constant time per literal.
Clauses of type (3) and (4) take O(n) time to construct. Thus, the overall
running time is $O(n^2 \cdot (n^2 \cdot k^2) \cdot n) = O(n^5 \cdot k^2)$.


A direct consequence of Theorem 6.13 is that SOFT TREE CONTAINMENT
on 2-labeled phylogenetic trees can be solved in $O(n^5)$ time. This is a somewhat
surprising application of 2-SAT programming as it is not apparent why
the step from 2-labeled to 3-labeled phylogenetic trees should differ so much
from the step from 3-labeled to 4-labeled phylogenetic trees.


Corollary 6.14. It can be verified in $O(n^5)$ time whether a 2-labeled phylogenetic
tree N softly displays a single-labeled phylogenetic tree T.


We remark that this running time is not optimized and a more careful analysis
using the amortized running time leads to a cubic running time [BMW18].


6.3.2 Reduction from k-SAT 


In this subsection, we supplement the result from the previous subsection in 
the sense that we show that k-SAT reduces to SOFT TREE CONTAINMENT 
on k-labeled trees. As a consequence, SOFT TREE CONTAINMENT is NP-hard 
even when restricted to 3-labeled phylogenetic trees. To this end, we make a 
slight detour and first show a reduction from k-SAT to a rather technically 
looking version of INDEPENDENT SET that will turn out to be equivalent to a 
very natural variant of COLORFUL INDEPENDENT SET. From this variant of 
COLORFUL INDEPENDENT SET, we will then show a reduction to SOFT TREE 
CONTAINMENT on k-labeled trees. 

The mentioned variant of INDEPENDENT SET is based on the notion of $A \bowtie B$
graphs. Therein, A and B are graph classes and a graph G = (V, E) is in $A \bowtie B$
if its edge set E can be partitioned into two sets $E_1$ and $E_2$ of edges such
that $G_1 := (V, E_1)$ is in graph class A and $G_2 := (V, E_2)$ is in B [BBN19].
We are interested in the case where A is the disjoint union of $P_3$'s and B
is the disjoint union of cliques of size at most k. Disjoint unions of cliques
are also known as cluster graphs. This leads to the following special case
of INDEPENDENT SET.


$P_3 \bowtie$ CLUSTER INDEPENDENT SET

Input: An integer $\ell$ and a graph G := (V, E) where $E = E_1 \uplus E_2$ such
that $G_1 := (V, E_1)$ is a collection of disjoint $P_3$'s and $G_2 := (V, E_2)$ is
a cluster graph in which each clique has size at most k.

Question: Does G contain an independent set of size $\ell$?


Van Bevern et al. [Bev+15] showed via a reduction from 3-SAT that
INDEPENDENT SET is NP-hard on $A \bowtie B$ graphs¹ unless A and B only contain
cluster graphs. We modify their reduction to be able to reduce from k-SAT. The
basic idea is to represent each clause by a clique and each variable by a cycle of
even length. The largest independent set can contain at most half of the vertices
in each cycle and at most one vertex from each clique and it contains that many
vertices if and only if the k-SAT formula is satisfiable. In the following, we
denote the number of literals in a clause C by |C|. Note that we can assume
without loss of generality that each variable occurs at most once in each clause
as otherwise the clause is either trivially satisfied (if one occurrence is positive
and the other negative) or one of the literals can be removed (if both occurrences
are positive or both are negative). Moreover, we assume that each variable
occurs at least twice in the formula as we can otherwise always satisfy the clause
in which the variable occurs.


Construction 6.15. Consider an instance $\varphi$ of k-SAT. Let $\varphi$ have n
variables $x_1, x_2, \ldots, x_n$ and m clauses $C_1, C_2, \ldots, C_m$ such that each variable occurs
at least twice in $\varphi$ and at most once in each clause. For each variable $x_i$ let $J_i$ be
the list of indices of clauses that contain $x_i$ or $\neg x_i$ and let $J_i[\ell]$ denote the $\ell$-th
element of this list. Construct a graph $G := (V, E_1 \uplus E_2)$ as follows. For each
variable $x_i$ construct a cycle $V_i$ of $2|J_i|$ vertices
$u_i^1, \bar{u}_i^1, u_i^2, \bar{u}_i^2, \ldots, u_i^{|J_i|}, \bar{u}_i^{|J_i|}$
such that $\bar{u}_i^k$ is adjacent to $u_i^k$ and $u_i^{k+1}$ for each $k \in [|J_i| - 1]$ and $u_i^1$
and $\bar{u}_i^{|J_i|}$ are adjacent. We call $V_i$ a variable gadget. For each clause $C_j$
that contains variables $x_{a_1}, x_{a_2}, \ldots, x_{a_{|C_j|}}$, construct a clique that contains
vertices $w_j^{a_1}, w_j^{a_2}, \ldots, w_j^{a_{|C_j|}}$. We call this clique a clause gadget. For each
variable $x_i$ and each $\ell \in [|J_i|]$, connect $w_{J_i[\ell]}^i$ to $\bar{u}_i^\ell$ if $C_{J_i[\ell]}$ contains $x_i$ and
to $u_i^\ell$ if $C_{J_i[\ell]}$ contains $\neg x_i$. The edge set $E_1$ consists of all edges between two
vertices v and w where v is contained in a variable gadget and w is contained
in a clause gadget. Moreover, $E_1$ contains the edge $\{u_i^k, \bar{u}_i^k\}$ for each variable
gadget $V_i$ and each $k \in [|J_i|]$. The edge set $E_2$ contains all constructed edges
that are not contained in $E_1$.


Figure 6.5: Illustration of a small extract of the resulting graph of Construction 6.15.
The triangle in the middle is the clause gadget for a clause of size three and the two
cycles left and right are variable gadgets corresponding to variables that occur in this
clause. The thin edges are contained in $E_1$ and the bold edges are contained in $E_2$.


¹ We remark that A and B have to be closed under disjoint union and taking an induced
subgraph. Moreover, at least one graph in A and one graph in B has to contain an edge.


See Figure 6.5 for an illustration of Construction 6.15. We show that the
graph $G_1 := (V, E_1)$ consists only of disjoint $P_3$'s. Note that $E_1$ contains all
edges $\{u_i^k, \bar{u}_i^k\}$ and exactly one of the two vertices in $\{u_i^k, \bar{u}_i^k\}$ is adjacent to
a vertex in a clause gadget in $G_1$. Since each vertex in a clause gadget has
degree exactly one in $G_1$, this proves that $G_1$ only consists of disjoint $P_3$'s. Next,
observe that $G_2 := (V, E_2)$ consists of disjoint cliques of size at most k. In each
variable gadget it contains every other edge, that is, a matching (disjoint cliques
of size two) and it contains all edges between vertices in clause gadgets which
are by definition of size at most k. We next show that Construction 6.15 is
correct.
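
The structural claims about $G_1$ and $G_2$ can be checked on small instances. The following Python sketch builds the two edge sets of Construction 6.15 and tests that $G_1$ is a disjoint union of $P_3$'s and that $G_2$ is a cluster graph with cliques of size at most k. The vertex encoding (tuples tagged `"u"`, `"ubar"`, `"w"`), the 0-based occurrence index, the function names, and the integer literal encoding (i for $x_i$, -i for its negation) are assumptions for illustration only.

```python
from collections import defaultdict

def build_graph(clauses):
    """Return (E1, E2) as sets of 2-element frozensets.

    Assumes each variable occurs at least twice in the formula and at most
    once per clause. Vertices: ("u", i, l) and ("ubar", i, l) in the variable
    gadget of x_i, and ("w", j, i) in the clause gadget of clause j.
    """
    J = defaultdict(list)  # J[i]: indices of clauses containing x_i or its negation
    for j, clause in enumerate(clauses):
        for lit in clause:
            J[abs(lit)].append(j)
    E1, E2 = set(), set()
    for i, occ in J.items():
        size = len(occ)
        for l, j in enumerate(occ):
            u, ubar = ("u", i, l), ("ubar", i, l)
            E1.add(frozenset((u, ubar)))
            # Cycle edge from ubar^l to u^(l+1), wrapping around at the end.
            E2.add(frozenset((ubar, ("u", i, (l + 1) % size))))
            # A positive occurrence is attached to ubar, a negative one to u.
            target = ubar if i in clauses[j] else u
            E1.add(frozenset((("w", j, i), target)))
    for j, clause in enumerate(clauses):
        ws = [("w", j, abs(lit)) for lit in clause]
        for a in range(len(ws)):
            for b in range(a + 1, len(ws)):
                E2.add(frozenset((ws[a], ws[b])))
    return E1, E2

def adjacency(edges):
    adj = defaultdict(set)
    for e in edges:
        a, b = tuple(e)
        adj[a].add(b)
        adj[b].add(a)
    return adj

def is_disjoint_p3s(edges):
    """Every degree-2 vertex has two degree-1 neighbors and vice versa."""
    adj = adjacency(edges)
    for v, nbrs in adj.items():
        if len(nbrs) == 2:
            if any(len(adj[w]) != 1 for w in nbrs):
                return False
        elif len(nbrs) == 1:
            (w,) = nbrs
            if len(adj[w]) != 2:
                return False
        else:
            return False
    return True

def is_cluster_graph(edges, k):
    """Adjacent vertices have equal closed neighborhoods of size at most k."""
    adj = adjacency(edges)
    closed = {v: nbrs | {v} for v, nbrs in adj.items()}
    return all(len(closed[v]) <= k and all(closed[w] == closed[v] for w in adj[v])
               for v in adj)

# (x1 or x2 or not x3) and (not x1 or not x2 or x3)
E1, E2 = build_graph([[1, 2, -3], [-1, -2, 3]])
```

For this example formula, $G_1$ has six $P_3$'s (one per variable occurrence) and $G_2$ has six cliques of size two and two triangles.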


Lemma 6.16. Let $\varphi$ be an instance of k-SAT in which each variable occurs
at least twice in $\varphi$ and at most once in each clause. Then, $\varphi$ is satisfiable if
and only if the graph $G := (V, E_1 \uplus E_2)$ resulting from Construction 6.15 has an
independent set of size $\ell$ where $\ell$ is the number of cliques in $G_2 := (V, E_2)$.

Proof. We start by showing that if G contains an independent set of size $\ell$,
then $\varphi$ is satisfiable. To this end, let I be an independent set of size $\ell$ in G.
Note that I contains exactly one vertex from each clique in $G_2$ and therefore
for each variable gadget $V_i$ it either contains $u_i^1$ or $\bar{u}_i^1$. By construction of $V_i$,
it holds that if $u_i^1 \in I$, then $u_i^\ell \in I$ for all $\ell \in [|J_i|]$. Analogously, if $\bar{u}_i^1 \in I$,
then $\bar{u}_i^\ell \in I$ for all $\ell \in [|J_i|]$. We now describe how to construct a satisfying
truth assignment $\beta$ for $\varphi$. For each variable $x_i$, we set $\beta(x_i) :=$ true if $u_i^1 \in I$
and $\beta(x_i) :=$ false if $\bar{u}_i^1 \in I$. It remains to show that this truth assignment $\beta$
satisfies all clauses in $\varphi$. To this end, consider any clause $C_j$. Since I contains
exactly one vertex from each clause gadget (each such gadget induces a clique
in $G_2$), it holds that $w_j^i \in I$ for some $i \in \{a_1, \ldots, a_{|C_j|}\}$. By construction, the variable $x_i$
occurs in $C_j$ (exactly once). If $C_j$ contains the literal $\neg x_i$, then $w_j^i$ is adjacent
to $u_i^h$ for some $h \in [|J_i|]$ and, since I is an independent set, I does not contain $u_i^h$.
Thus, $\bar{u}_i^h \in I$ and therefore $\beta(x_i) =$ false and $C_j$ is satisfied by $\beta$. If $C_j$ contains
the literal $x_i$, then $w_j^i$ is adjacent to $\bar{u}_i^h$ for some $h \in [|J_i|]$ and analogously $u_i^h \in I$.
Thus, $C_j$ is satisfied by $\beta$ as $\beta(x_i) =$ true. Since each clause is satisfied by $\beta$,
this concludes the first direction of the proof.


It remains to show that if $\varphi$ is satisfiable, then G contains an independent
set of size $\ell$. Let $\beta$ be a satisfying truth assignment for $\varphi$. We construct an
independent set I of size $\ell$ for G as follows. For each variable $x_i$, if $\beta(x_i) =$ true,
then I contains all vertices $u_i^h$ for $h \in [|J_i|]$ and if $\beta(x_i) =$ false, then I contains
all vertices $\bar{u}_i^h$ for $h \in [|J_i|]$. For each clause $C_j$, let $x_i$ be a variable that
satisfies $C_j$ under assignment $\beta$ and let I contain $w_j^i$. Observe that I is of size $\ell$
as it contains exactly one vertex of each clique in $G_2$. It remains to show that I
is indeed an independent set. Assume towards a contradiction that I was not an
independent set. Then it contains two adjacent vertices. Note that it does not
contain two adjacent vertices from variable gadgets as it contains every second
vertex from the respective cycle. It does not contain two adjacent vertices from
clause gadgets either as it contains exactly one vertex from each clause gadget
and vertices from different clause gadgets are not adjacent in G. Hence, I
contains a vertex $w_j^i$ from a clause gadget and a vertex v from a variable
gadget such that v and $w_j^i$ are adjacent. If $w_j^i \in I$, then $x_i$ satisfies $C_j$ by
construction, that is, $\beta(x_i) =$ true if $C_j$ contains the literal $x_i$ and $\beta(x_i) =$ false
if $C_j$ contains the literal $\neg x_i$. We distinguish between the two cases where $C_j$
contains the literal $x_i$ or the literal $\neg x_i$. If $C_j$ contains the literal $x_i$, then by
construction $w_j^i$ is only adjacent to vertices in the clause gadget for $C_j$ and
to $\bar{u}_i^h$ for some $h \in [|J_i|]$. Since $\beta(x_i) =$ true, it holds that $u_i^h \in I$ and $\bar{u}_i^h \notin I$, a
contradiction to the assumption that $w_j^i$ has a neighbor in I. If $C_j$ contains $\neg x_i$,
then by construction $w_j^i$ is only adjacent to vertices in the clause gadget for $C_j$
and to $u_i^h$ for some $h \in [|J_i|]$. Since $\beta(x_i) =$ false, it holds that $u_i^h \notin I$, which is
again a contradiction to the assumption that $w_j^i$ has a neighbor in I. Thus, I is
an independent set which concludes the proof.


Figure 6.6: A gadget for a $P_3$ in Construction 6.17 where the inner vertex has a
green color (the two upper leaves) and the end vertices have red (bottom left) and
yellow color (bottom right), respectively. The triangles and squares represent leaves
of different labels. There are six different labels in this phylogenetic tree (which are
represented by a red square, a red triangle, a yellow square, a yellow triangle, a green
square, and a green triangle, respectively).


Note that, by construction, the independent set has to contain exactly one
vertex from each clique in $G_2$. This is equivalent to giving each vertex in G a
color that represents in which clique in $G_2$ the vertex is contained and asking
for a colorful independent set, that is, an independent set which contains
exactly one vertex of each color. Hence, Lemma 6.16 implies that k-SAT
reduces to COLORFUL INDEPENDENT SET on disjoint $P_3$'s where each color
appears at most k times and no $P_3$ contains two vertices of the same color.
We next reduce this variant of COLORFUL INDEPENDENT SET to SOFT TREE
CONTAINMENT on k-labeled trees. The basic idea is to construct a gadget as
shown in Figure 6.6 for each $P_3$ in the input graph and connect all of these
gadgets by an arbitrary binary tree whose leaves are the roots of the respective
gadgets. By Lemma 6.8, if this phylogenetic tree N displays T, then it contains
a single-labeled phylogenetic tree S that displays T. We will show that S can
contain either a leaf that represents the inner vertex in the respective $P_3$ or only
leaves that represent end vertices of the respective $P_3$. Hence, we can use S to
construct an independent set.
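
To illustrate the intermediate problem, the following Python sketch solves COLORFUL INDEPENDENT SET on disjoint $P_3$'s by exhaustive search. It is meant only to make the problem statement concrete on toy inputs; it takes exponential time, which is in line with the hardness shown in this subsection. The input encoding (one color triple per $P_3$, with the inner vertex in the middle position) and the function name are assumptions for this example.

```python
from itertools import product

def colorful_independent_set(paths):
    """paths: list of triples (end color, inner color, end color), one per P_3.

    Returns a dict mapping each color to a chosen vertex (path index, position)
    such that no two chosen vertices are adjacent, or None if none exists.
    Position 1 is the inner vertex; it is adjacent to positions 0 and 2.
    """
    by_color = {}
    for p, colors in enumerate(paths):
        for pos, c in enumerate(colors):
            by_color.setdefault(c, []).append((p, pos))
    colors = sorted(by_color)
    # Try every way of picking exactly one vertex per color.
    for choice in product(*(by_color[c] for c in colors)):
        chosen = set(choice)
        independent = all(
            not ((p, 1) in chosen and ((p, 0) in chosen or (p, 2) in chosen))
            for p in range(len(paths))
        )
        if independent:
            return dict(zip(colors, choice))
    return None

# One P_3 with three distinct colors forces all three vertices: no solution.
no_instance = colorful_independent_set([("r", "g", "y")])
# Two P_3's: e.g. both end vertices of the first and an end vertex of the second.
yes_instance = colorful_independent_set([("r", "g", "y"), ("g", "r", "y")])
```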


Construction 6.17. Given a vertex-colored collection G := (V, E) of $P_3$'s
where each color occurs at most k times, we construct a k-labeled phylogenetic
tree N and a single-labeled phylogenetic tree T as follows. Both phylogenetic
networks contain two different labels $i_1$ and $i_2$ for each color i in G.

Construct T by first creating a star that has exactly one leaf of each color
occurring in G. Then, for each leaf x with color i, add two new leaves below x,
labeled with $i_1$ and $i_2$, respectively. Since x is not a leaf any more, its label is removed.

The k-labeled phylogenetic tree N is constructed as follows. We start with a
gadget as shown in Figure 6.6 for each $P_3 = (u, v, w)$ in the input graph where
red, green, and yellow denote the colors of u, v, and w, respectively. Therein,
a triangle of color i represents a leaf labeled with $i_1$ and a square of color i
represents a leaf labeled with $i_2$. Finally, add an arbitrary binary tree that has
a leaf for each $P_3$ in G and identify each such leaf with the root of the respective
constructed gadget.


Figure 6.7: Illustration of Construction 6.17.
Left: The initial instance of COLORFUL INDEPENDENT SET on disjoint $P_3$'s with 4
colors (red, blue, green, and yellow) where each color occurs at most thrice. The
encircled vertices represent a solution.
Middle: The binary 3-labeled phylogenetic tree N resulting from Construction 6.17.
The highlighted edges represent the single-labeled subtree S of N that displays T and
that corresponds to the marked solution on the left-hand side.
Right: The single-labeled phylogenetic tree T resulting from Construction 6.17.


An example of Construction 6.17 is given in Figure 6.7. We conclude this
subsection with the proof that k-SAT reduces to SOFT TREE CONTAINMENT
on k-labeled phylogenetic trees and a simple corollary that states NP-hardness.


Proposition 6.18. k-SAT reduces for each k to SOFT TREE CONTAINMENT
on binary k-labeled phylogenetic trees.

Proof. Note that k-SAT reduces by Lemma 6.16 to COLORFUL INDEPENDENT
SET on disjoint $P_3$'s where each color appears at most k times. We then apply
Construction 6.17 to the constructed instance of COLORFUL INDEPENDENT SET.
Since the resulting phylogenetic tree from Construction 6.17 is k-labeled and
binary, it remains to show that this construction is correct, that is, N displays T
if and only if the given collection G := (V, E) of $P_3$'s has a colorful independent set.

We first show that if N displays T, then there is a colorful independent set in G.
If N displays T, then, by Lemma 6.8, N contains a single-labeled phylogenetic
tree S that displays T. By Observation 6.1, this is equivalent to T displaying S.
Let Q be the set of vertices in G such that for each vertex $v \in Q$ of color c, S
contains a leaf with label $c_1$ in the gadget for the respective $P_3$ that v is in.
Since S displays T, it contains a leaf with label $c_1$ and a leaf with label $c_2$ for
each color c. Moreover, since S is single-labeled it contains exactly one leaf
with label $c_1$ for each color c and therefore Q is colorful, that is, it contains
exactly one vertex of each of the colors in G. Hence, it remains to show that Q
is an independent set in G. Assume towards a contradiction that Q is not an
independent set, that is, it contains two adjacent vertices u and w. Let without
loss of generality u be an inner vertex in a $P_3$ and let c and d be the colors of u
and w, respectively. Since $u, w \in Q$, it holds that S contains the leaf with label $c_1$
and the leaf with label $d_1$ in the same gadget. By construction, S displays the
triplet $c_1 d_1 | c_2$ and T displays the triplet $c_1 c_2 | d_1$ firmly. By Lemma 6.6(b), this
contradicts the fact that T displays S.

We conclude the proof by showing that if G contains a colorful independent
set I, then N displays T. To this end, let I be a colorful independent set in G.
We will show that there is a single-labeled subtree S of N that displays T. This
implies by Lemma 6.8 that N displays T. For each vertex $v \in I$ of color c, let S
contain the two leaves with labels $c_1$ and $c_2$ in the gadget for the respective $P_3$
that v is in. Since I is colorful, S contains exactly one leaf of each label and it
therefore remains to show that S displays T.

Assume towards a contradiction that S does not display T. This is, by
Observation 6.1, equivalent to T not displaying S. In this case, there is, by
Lemma 6.6, a triplet xy|z that is firmly displayed by S but not softly displayed
by T. By Observation 6.3(b), T then displays one of the triplets xz|y or yz|x
firmly. Let T without loss of generality display xz|y firmly. By construction
of T, it holds that $y := c_1$ and $z := c_2$ (or $y := c_2$ and $z := c_1$) for some color c.
By construction of S, it holds that the two leaves labeled with $c_1$ and $c_2$ in S
are in the same gadget. Hence, $c_1$ and $c_2$ correspond to an inner vertex in
the respective $P_3$ as otherwise there is no label x such that S displays the
triplet $c_1 x | c_2$ (or $c_2 x | c_1$). By construction, I contains the inner vertex v of
color c in the respective $P_3$. Moreover, it holds that the leaf with label x is also
contained in the same gadget and thus I contains one of the two end vertices in
the same $P_3$ as v, a contradiction to the fact that I is an independent set.


Since k-SAT is NP-hard for each $k \geq 3$, it holds that SOFT TREE
CONTAINMENT is NP-hard on binary k-labeled phylogenetic trees and, in particular,
when restricted to 3-labeled phylogenetic trees.


Corollary 6.19. SOFT TREE CONTAINMENT is NP-hard, even if the input 
network N is a binary 3-labeled phylogenetic tree. 


6.4 Concluding Remarks 


We initiated research into a practically relevant variant of TREE CONTAINMENT
handling soft polytomies. We again defer the discussion on 2-SAT programming
as a technique to the concluding chapter of this thesis and focus on SOFT TREE
CONTAINMENT here. We laid the mathematical foundations for dealing with soft
polytomies and showed the dichotomy result that SOFT TREE CONTAINMENT
on k-labeled phylogenetic trees is polynomial-time solvable if $k \leq 2$ and NP-hard
if $k \geq 3$. Further improving the running time of the polynomial-time algorithm
for 2-labeled phylogenetic trees (e.g. within the context of FPT in P as done
in Chapter 3) and empirically evaluating it on real-world data sets are clear
avenues for further research.

Motivated by our hardness result, the search for parameterized or approximation
algorithms is another logical next step. Previous work for TREE
CONTAINMENT [GLZ16, Wel18] might lend promising ideas and parameterizations
to this effort.


Chapter 7


Reachable Objects 


In this chapter, we will investigate a problem from the widely-studied field 
of resource allocation under preferences, having applications in areas such as 
artificial intelligence and economics. Conceptually, we will develop a 2-SAT 
program where the truth assignment of a variable does not represent picking 
some element into a solution or not. It rather represents which of two elements 
is picked into a solution. These types of 2-SAT programs are so far very rare in 
the literature. We mention that the 2-SAT program we develop in this chapter 
does not meaningfully generalize to a k-SAT program and therefore the 2-SAT 
program is not a special case of a reduction to k-SAT. In Chapter 8, we will 
analyze the structure of the problem we study here and observe which structural 
elements enable 2-SAT programming. This will lead us to a rule of when 2-SAT 
programming can be a promising tool for solving algorithmic problems. 
Regarding resource allocation under preferences, we will investigate the 
REACHABLE OBJECT problem which generalizes the well-known HOUSING 
MARKET problem [SS74]. In REACHABLE OBJECT, agents are organized in 
a graph and two agents can only swap resources if they share an edge in the 
graph. This restriction models the situation where not all agents are able to 
communicate and swap with each other. We start with a dichotomy result 
regarding the number of objects each agent prefers over its initially held object 
and continue with investigating the special case where each agent has at most 
two neighbors in the graph. Using 2-SAT programming, we will show that this 
special case is polynomial-time solvable. The problem remains NP-hard for the 
case where each agent has at most four neighbors in the graph [SW18].
Resource allocation under preferences is a major topic in society and technology
[Wal15]. It has also proven to be a key issue in a world of limited resources
and allocating indivisible resources is well-studied in the context of multiagent
systems [BCM16]. It has numerous applications, e.g. in contexts of food banks,
when sharing charitable donations between cities or communities, or when
allocating physical to virtual resources in virtualization technologies [BKN18].
There are several versions studied in the literature that try to optimize for
different criteria such as Pareto optimality, fairness, or social welfare [Abr+05,
Rot82, SU10].


In the field of resource allocation under preferences, one is interested in 
distributing a set of (divisible or indivisible) objects among a set of agents who 
value the objects differently. We focus entirely on indivisible objects here and 
consider the special case where each agent initially holds exactly one object. 
While a large body of research in the literature takes a centralized approach 
that globally controls and reallocates an object to each agent, we pursue a 
decentralized strategy where any pair of agents may locally swap objects as 
long as this leads to an improvement for both of them, that is, they both 
value the object they get over the one they give away [DBC15]. We are then 
interested whether there is a sequence of such rational trades that leads to a 
situation where a given agent obtains a given object. Other examples of recently 
studied problems regarding allocations of indivisible resources under social 
network constraints are envy-free allocations [Bey+19, BKN18], Pareto-optimal 
allocations [IP19], and stable matchings [ABH17, AV09]. 


The main contribution of this chapter is a polynomial-time algorithm for 
REACHABLE OBJECT on cycles and the following dichotomy result. If each 
agent prefers at most two other objects over the object it initially holds, then the 
problem is linear-time solvable. If some agents prefer more than two objects over 
their initially held object, then the problem is NP-hard. The polynomial-time 
reduction in the hardness result also shows that the problem remains NP-hard 
if the underlying graph is a clique, that is, all agents can pairwise swap with 
one another. It might be tempting to think that the hardness then stems from 
the density of the underlying graph as cycles are very sparse. This assumption 
is, however, false as the problem is known to be NP-hard even when the input 
graph is a tree [GLW17]. 


Section 7.2 is dedicated to the dichotomy result and in Section 7.3 we will
present our 2-SAT-programming-based polynomial-time algorithm for REACHABLE
OBJECT on cycles. We mention that the positive result in the dichotomy
part is based on dynamic programming.
7.1 Problem Definition and Related Work


Let $V := \{1, 2, \ldots, n\}$ be a set of $n$ agents and let
$X := \{x_1, x_2, \ldots, x_n\}$ be a set of $n$ objects. Each agent $i \in V$ has a
preference list over the objects in $X$, which is a strict linear order on $X$.
This list is denoted as $\succ_i$ and we omit the subscript $i$ if the agent is
clear from the context. For two objects $x_j, x_\ell$, the notation
$x_j \succ_i x_\ell$ means that agent $i$ prefers $x_j$ over $x_\ell$. A preference
profile $P$ is a collection $(\succ_i)_{i \in V}$ of preference lists of the agents
in $V$. An assignment is a bijection $\sigma\colon V \to X$, where each agent $i$ is
assigned exactly one object $\sigma(i) \in X$. Since assignments are bijections, we
will also use $\sigma^{-1}(x_j)$ to denote the agent that holds $x_j$ in
assignment $\sigma$.

Let $G := (V, E)$ be a graph where the set $V$ of agents is the set of vertices.
An edge in this graph models that two agents know and trust each other enough
to swap objects. We say that an assignment $\sigma$ admits a rational trade for two
agents $i$ and $j$, denoted as $\tau = \{(i, \sigma(i)), (j, \sigma(j))\}$, if the
vertices corresponding to $i$ and $j$ are adjacent in the graph
($\{i, j\} \in E$) and each of the two agents prefers the other's assigned object
over its own object ($\sigma(j) \succ_i \sigma(i)$ and
$\sigma(i) \succ_j \sigma(j)$). After performing the swap specified by $\tau$,
agent $i$ holds object $\sigma(j)$, agent $j$ holds object $\sigma(i)$, and the other
agents keep their objects. To describe this move, we say that objects
$\sigma(i)$ and $\sigma(j)$ are swapped over edge $\{i, j\}$. Sometimes, we also say
that object $\sigma(i)$ (or $\sigma(j)$) passes through edge $\{i, j\}$ or moves from
agent $i$ to $j$.

A sequence of swaps is a sequence $(\sigma_0, \sigma_1, \ldots, \sigma_t)$ of
assignments where for each index $k \in \{0, 1, \ldots, t-1\}$ there are two agents
$i, j \in V$ for which $\sigma_k$ admits a swap
$\tau = \{(i, \sigma_k(i)), (j, \sigma_k(j))\}$ such that


1. $\sigma_{k+1}(i) = \sigma_k(j)$,


2. $\sigma_{k+1}(j) = \sigma_k(i)$, and


3. $\sigma_{k+1}(z) = \sigma_k(z)$ for each remaining agent $z \in V \setminus \{i, j\}$.


We call an assignment $\sigma'$ reachable from another assignment $\sigma$ if there
is a sequence $(\sigma_0, \sigma_1, \ldots, \sigma_t)$ of swaps such that
$\sigma_0 = \sigma$ and $\sigma_t = \sigma'$. We say that an object $x \in X$ is
reachable for an agent $i$ from a given initial assignment $\sigma_0$ if there is an
assignment $\sigma$ which is reachable from $\sigma_0$ with $\sigma(i) = x$.
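The definitions of rational trades and reachability can be made concrete with a
small brute-force search. The following is only a sketch under stated
assumptions: the function names, the dictionary encoding of assignments, the
truncated preference lists, and the three-agent toy instance are all
hypothetical and not taken from the text. The search explores every assignment
reachable by rational trades, so it takes exponential time in general and is
meant for illustration only.

```python
from collections import deque


def prefers(pref, a, b):
    """True if an agent with list `pref` (best first) strictly prefers a over b.

    Assumption of this sketch: `pref` may be truncated at the initially held
    object; objects missing from the list rank below every listed object.
    """
    rank = {obj: pos for pos, obj in enumerate(pref)}
    worst = len(pref)
    return rank.get(a, worst) < rank.get(b, worst)


def reachable(prefs, edges, sigma0, agent, obj):
    """Breadth-first search over all assignments reachable by rational trades."""
    start = tuple(sorted(sigma0.items()))
    seen, queue = {start}, deque([start])
    while queue:
        sigma = dict(queue.popleft())
        if sigma[agent] == obj:
            return True
        for i, j in edges:
            # Rational trade: both agents strictly prefer the other's object.
            if (prefers(prefs[i], sigma[j], sigma[i])
                    and prefers(prefs[j], sigma[i], sigma[j])):
                nxt = dict(sigma)
                nxt[i], nxt[j] = sigma[j], sigma[i]
                key = tuple(sorted(nxt.items()))
                if key not in seen:
                    seen.add(key)
                    queue.append(key)
    return False


# Hypothetical toy instance: the path 1 - 2 - 3, agent i initially holds x_i.
prefs = {1: ["x3", "x1"], 2: ["x1", "x3", "x2"], 3: ["x2", "x3"]}
sigma0 = {1: "x1", 2: "x2", 3: "x3"}
path = [(1, 2), (2, 3)]
print(reachable(prefs, path, sigma0, 1, "x3"))
```

In the toy instance, object x3 first moves over edge {2, 3} and then over edge
{1, 2}; without the edge {2, 3} the search finds no rational trade at all.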

With these definitions at hand, we can now define the problem REACHABLE
OBJECT introduced by Gourvès et al. [GLW17] which we study in this chapter.
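[Figure 7.1: a cycle on the six agents 1, 2, ..., 6 (left) and the preference
lists of the agents (right), with the initially held object boxed.]

  1: $x_3 \succ x_4 \succ \boxed{x_1}$              2: $x_1 \succ x_3 \succ x_4 \succ \boxed{x_2}$
  3: $x_1 \succ x_2 \succ x_4 \succ \boxed{x_3}$    4: $x_5 \succ x_3 \succ \boxed{x_4}$
  5: $x_6 \succ x_3 \succ \boxed{x_5}$              6: $x_4 \succ x_3 \succ \boxed{x_6}$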

Figure 7.1: An example of REACHABLE OBJECT. The six agents and the graph of
agents are depicted on the left-hand side. The preference lists are depicted on the right
and the initial assignment $\sigma_0$ is illustrated by the boxes in the preference lists (each
agent initially holds the object that is drawn in a box in the agent's preference list).
Since no agent will agree on receiving an object in any swap that it does not prefer
over its initially held object, these objects are for the sake of readability not depicted
in the preference lists. The agent $I$ is agent 1 and $x$ is the object $x_3$. If the underlying
graph was complete, then object $x_3$ would be reachable for agent 1 within one swap.
However, if the graph is a cycle as shown, then to reach agent 1 object $x_3$ has to be
swapped along $\{2, 3\}$ with object $x_2$ first and then along $\{1, 2\}$ with object $x_1$. Note
that at both edges both incident agents agree to the swap as agent 3 prefers $x_2$ over $x_3$
and agent 2 prefers $x_3$ over $x_2$ and $x_1$ over $x_3$. Finally, agent 1 prefers $x_3$ over $x_1$.


REACHABLE OBJECT

Input: A set $V$ of agents, a set $X$ of objects with $|X| = |V|$, a preference
profile $P$, an initial assignment $\sigma_0$, a graph $G := (V, E)$, an agent $I \in V$,
and an object $x \in X$.

Question: Is $x$ reachable for $I$ from $\sigma_0$?


An example of REACHABLE OBJECT is given in Figure 7.1. Note that an
agent $i$ that gives away a certain object $x_j$ during a sequence of swaps obtains
an object it prefers over $x_j$ and hence agent $i$ will not accept object $x_j$ in the
future.


Observation 7.1. Let $\phi := (\sigma_0, \sigma_1, \ldots, \sigma_s)$ be a sequence of
swaps, let $i$ be an agent and let $x_j$ be an object. If $\sigma_r(i) = x_j$ and
$\sigma_{r+1}(i) \neq x_j$ for some $r \in [s-1]$, then $\sigma_{r'}(i) \neq x_j$ for
all $r' > r$.


Concerning related work, Gourvès et al. [GLW17] introduced REACHABLE
OBJECT and showed that it is NP-hard on trees. Moreover, they showed 
polynomial-time solvability on stars and for a special case on paths, namely 
when testing whether an object is reachable for an agent positioned on an 
end vertex of the path. Huang and Xiao [HX20] generalized this special case
and showed that REACHABLE OBJECT on paths is polynomial-time solvable 
independently of where the target agent I is located on the path. They also 
considered a version where agents can value different objects equally (that is, 
the preference lists are not strict) and showed that in this case REACHABLE 
OBJECT remains NP-hard on paths. Saffidine and Wilczynski [SW18] studied the 
parameterized complexity of REACHABLE OBJECT with respect to parameters 
such as the maximum degree of the input graph or the overall number of swaps 
allowed in a sequence. They showed that REACHABLE OBJECT remains NP- 
hard even on graphs with maximum degree at most four. Further, they showed 
that REACHABLE OBJECT is W[1]-hard when parameterized by the length
of the minimum sequence of swaps that leads to agent $I$ obtaining object $x$.
Finally, REACHABLE OBJECT is NP-complete on generalized caterpillars where
each hair has length at most two and only one vertex has degree larger than
two [Ben+19a].


7.2 Length of Preference Lists 


In this section, we will show a complexity dichotomy result with regard to the 
maximum length of a preference list. Notice that each agent initially holds one 
object and it will never obtain any object that it does not prefer over its initially 
held object. Thus, we describe the preference list of an agent only up to its 
initially held object. The length of the preference list of an agent is then defined 
as the number of objects the agent likes at least as much as its initially held 
object and the maximum length of a preference list is the length of a longest 
preference list of any agent. 

The parameter maximum length of a preference list is mainly motivated by the 
following two scenarios. In many applications each agent only knows some of 
the objects (e. g. potential buyers usually only visited five to ten houses and do 
not like all of them or when ranking movies each participant has only seen some 
of the available movies) and in other applications even when all alternatives are 
known only a few of them are appealing (e. g. when applying for a job or when 
choosing food). Notably, Saffidine and Wilczynski [SW18] suggested to study 
REACHABLE OBJECT with restrictions on the preference lists. 

We will show in Subsection 7.2.1 that instances in which the maximum length 
of a preference list is at most three can be solved in linear time. We complement 
this result in Subsection 7.2.2 by showing that REACHABLE OBJECT is NP-hard 
even if restricted to cases where the maximum length of a preference list is at
most four and where the underlying graph is a clique. 


7.2.1 Maximum Length at Most Three 


In this subsection, we provide a linear-time algorithm for REACHABLE OBJECT
when the maximum length of a preference list is at most three. The main idea is
to reduce REACHABLE OBJECT to computing an $s$-$t$-path in a directed graph.
Throughout this subsection, we assume that each agent $i$ initially holds object $x_i$.
Consider all agents that hold the given target object $x$ during a sequence of
swaps that leads to agent $I$ obtaining object $x$. All those agents except for
the agent $a$ that initially holds $x$ and agent $I$ must swap their initially held
object to receive $x$ and then receive their most preferred object for giving $x$
away. We call those agents $x$-forwarders. Concerning agent $I$, it might swap
its initially held object $x_I$ in order to receive $x$ or it might first receive an
object $x_w$ and then swap $x_w$ away in order to receive $x$. Note that in this case
the preference list of agent $I$ is $x \succ x_w \succ \boxed{x_I}$. Since each
preference list is of length at most three, the object $x_w$ is unique. Note that
all agents that hold object $x_w$ in the mentioned sequence of swaps except for
agents $I$ and $w$ must swap their initially held object to receive $x_w$ and then
receive their most preferred object for giving $x_w$ away. Analogously to
$x$-forwarders, we call such agents $x_w$-forwarders. Hence, we basically just
consider the case distinction whether agent $I$ is an $x_w$-forwarder or not and
which objects agents $a$ and $w$ receive in exchange for their initially held
objects. We remark that there is a special case where $a$ is an $x_w$-forwarder and
agent $w$ is an $x$-forwarder. Figure 7.2 gives an example of REACHABLE OBJECT
where the maximum length of preference lists is at most three.

Let $(\sigma_0, \sigma_1, \ldots, \sigma_t)$ be a sequence of swaps. To ease the
reasoning, we define $\tau_i$ to be the swap that transforms $\sigma_{i-1}$ into
$\sigma_i$. Formally, $\tau_i = \{(j, x_p), (k, x_q)\}$ such that


1. $\sigma_{i-1}(j) = \sigma_i(k) = x_p$, and
2. $\sigma_{i-1}(k) = \sigma_i(j) = x_q$.


Using this notation, we first prove a property which allows us to exclusively
focus on the objects $x_w$ and $x$. Roughly speaking, any solution can be partitioned
into two sequences of swaps. In the first sequence, the object $x_w$, which agent $I$
swaps in exchange for object $x$, is swapped between each two consecutive
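[Figure 7.2: a cycle on the six agents 1, 2, ..., 6 (left) and the preference
lists of the agents (right), with the initially held object boxed.]

  1: $x_3 \succ x_4 \succ \boxed{x_1}$    2: $x_1 \succ x_4 \succ \boxed{x_2}$
  3: $x_2 \succ x_4 \succ \boxed{x_3}$    4: $x_5 \succ x_3 \succ \boxed{x_4}$
  5: $x_6 \succ x_3 \succ \boxed{x_5}$    6: $x_4 \succ x_3 \succ \boxed{x_6}$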


Figure 7.2: An example of REACHABLE OBJECT that is a slight modification of the
example in Figure 7.1. Initially held objects are again drawn in boxes and the question
is still whether $x_3$ is reachable for agent 1. Then, our algorithm finds the following
swap sequence for object $x_3$ to reach agent 1:
$4 \leftrightarrow 3$, $3 \leftrightarrow 2$, $2 \leftrightarrow 1$, $4 \leftrightarrow 5$, $5 \leftrightarrow 6$, $6 \leftrightarrow 1$,
where "$i \leftrightarrow j$" means that agents $i$ and $j$ swap the objects they currently hold.
In this example each agent in $\{1, 2, 3\}$ is an $x_4$-forwarder and each agent in $\{4, 5, 6\}$ is
an $x_3$-forwarder.


assignments. In the second sequence, object $x$ is swapped between each two
consecutive assignments. More specifically, the following lemma states that the
sequence of swaps resulting from performing all swaps that involve $x_w$ and no
other swaps leads to agent $I$ obtaining $x_w$.


Lemma 7.2. Let

$(V = \{1, 2, \ldots, n\}, X = \{x_1, x_2, \ldots, x_n\}, P, \sigma_0, G = (V, E), I, x)$

be an instance of REACHABLE OBJECT where $\sigma_0(i) = x_i$ for all $i \in [n]$ and
where the maximum length of preference lists is at most three. Let
$\phi := (\sigma_0, \sigma_1, \ldots, \sigma_t)$ be a sequence of swaps such that
$\sigma_t(I) = x$. Consider two objects $x_p$ and $x_q$ such that there is a swap
$\tau_r$ with $\tau_r = \{(I, x_p), (j, x_q)\}$, that is, agent $I$ obtains object $x_q$
in exchange for $x_p$ during $\phi$. Let
$T = \{\tau_i \mid \tau_i = \{(j, x_{p'}), (k, x_q)\} \wedge i \le r\}$ be the set of all
swaps between assignments in $\phi$ up to assignment $\sigma_r$ that involve swapping
$x_q$. We denote the elements of $T$ by $\tau'_1, \tau'_2, \ldots, \tau'_{|T|}$ such that
swap $\tau'_i$ occurs before swap $\tau'_j$ in $\phi$ for each $i < j$. Let
$\phi_{\mathrm{start}} = (\sigma'_0, \sigma'_1, \ldots, \sigma'_s)$ be the sequence of
assignments such that $\sigma'_0 := \sigma_0$ and $\sigma'_i$ is the result of performing
swap $\tau'_i$ in assignment $\sigma'_{i-1}$. Let
$\tau'_i := \{(a_{i-1}, x_q), (a_i, x_{b_i})\}$ for each $i \in [s]$. Then,


(i) $\tau'_s = \tau_r$,


(ii) $a_0 = q$ and agent $q$ prefers $x_{b_1}$ over $x_q$,

(iii) $a_s = I$ and agent $I$ prefers $x_q$ over its initially held object $x_I$,


(iv) for each $z \in [s-1]$ agent $a_z$ has preference list
$x_{b_{z+1}} \succ x_q \succ \boxed{x_{b_z}}$ and is an $x_q$-forwarder,


(v) if agent $I$ prefers $x$ over $x_q$, then no agent $a_z$ with $z \in [s-1]$ prefers
object $x$ over its initially held object, and


(vi) if agent $I$ initially holds $x_p$, then $\phi_{\mathrm{start}}$ is a sequence of
swaps such that $\sigma'_s(I) = x_q$.


Proof. We prove the individual statements one after another and we start with
statement (i). To this end, note that by definition, the last swap $\tau_r$ between
$\sigma_{r-1}$ and $\sigma_r$ contains $(j, x_q)$ for some agent $j$. Since the swaps
between consecutive assignments in $\phi_{\mathrm{start}}$ are exactly those that
involve swapping object $x_q$, it follows that the swap between $\sigma'_{s-1}$ and
$\sigma'_s$ must be $\tau_r$ and thus $\tau'_s = \tau_r$ and statement (i) holds.

We continue with statement (ii). By definition,
$\tau'_1 := \{(a_0, x_q), (a_1, x_{b_1})\}$ and agent $q$ initially holds object $x_q$.
By definition of rational swaps it holds that agent $a_0$ initially holds object
$x_q$ and prefers $x_{b_1}$ over $x_q$. Since each object is unique, since each object
is only held by one agent at a time, and since both agents $a_0$ and $q$ initially
hold $x_q$, it holds that $a_0 = q$ and thus statement (ii) holds.

To show statement (iii), recall that
$\tau_r = \tau'_s = \{(a_{s-1}, x_q), (a_s, x_{b_s})\}$ and thus $a_s = I$. Since
$\tau'_s$ is a rational swap, agent $a_s$ has to prefer $x_q$ over $x_{b_s}$. Since
agent $a_s$ holds $x_{b_s}$ during $\phi$, it holds that $x_{b_s} = x_I$ or that agent
$a_s$ prefers $x_{b_s}$ over $x_I$. In both cases it prefers $x_q$ over $x_I$ and thus
statement (iii) holds.

We next prove statement (iv). Assume towards a contradiction that there is a
minimum $z \in [s-1]$ such that $a_z$ does not have preference list
$x_{b_{z+1}} \succ x_q \succ \boxed{x_{b_z}}$ or that it is not an $x_q$-forwarder.
Observe that if $z = 1$, then by statement (ii) $a_{z-1} = a_0 = q$ and agent
$a_{z-1}$ initially holds $x_q$ and otherwise, since $z$ is minimum, it holds that
after swap $\tau'_{z-1}$ agent $a_{z-1}$ holds object $x_q$. Since $\tau'_z = \tau_k$
for some $k$, it holds that $x_q \succ_{a_z} \boxed{x_{b_z}}$ and agent $a_z$ trades
$x_{b_z}$ away for $x_q$. By definition of $\tau'_{z+1}$, agents $a_z$ and $a_{z+1}$
then swap $x_q$ and $x_{b_{z+1}}$ and thus $x_{b_{z+1}} \succ_{a_z} x_q$ and $a_z$ is an
$x_q$-forwarder, a contradiction.

We next show statement (v). Note that if agent $I$ prefers object $x$ over $x_q$,
then $x \neq x_q$. Assume towards a contradiction that some agent $a_z$ with
$z \in [s-1]$ prefers $x$ over $x_{b_z}$. Note that in this case $x \neq x_{b_z}$ as
$x_{b_z}$ is the object that $a_z$ initially holds. Then by statement (iv), it holds
that $a_z$ only prefers $x_q$ and $x_{b_{z+1}}$ over $x_{b_z}$ and that $a_z$ obtains
$x_{b_{z+1}}$ during $\phi$ before agent $I$ obtains object $x$. Since $x \neq x_q$, it
holds that $x = x_{b_{z+1}}$ and since $a_z$ holds its most preferred object $x$, it
will never trade this object away. Thus, object $x$ cannot be obtained by agent $I$
during $\phi$, a contradiction.

Finally, to show statement (vi), notice that by statement (i) and the definition
of $\tau_r$ it holds that $\sigma'_s(I) = x_q$ and hence it remains to show that
$\phi_{\mathrm{start}}$ is a sequence of swaps. Assume towards a contradiction that
$\phi_{\mathrm{start}}$ is not a sequence of swaps, that is, there are two consecutive
assignments $\sigma'_{i-1}$ and $\sigma'_i$ such that $\tau'_i$ is not a rational swap.
Since by definition $\tau'_i = \tau_k$ for some $k$, it holds that if
$\tau'_i = \{(a_{i-1}, x_q), (a_i, x_{b_i})\}$ is possible in the sense that $a_{i-1}$
holds $x_q$ and agent $a_i$ holds $x_{b_i}$, then $\tau'_i$ is a rational swap. Assume
without loss of generality that $\tau'_i$ is the first swap between consecutive
assignments in $\phi_{\mathrm{start}}$ that is not possible. Then, $i = 1$ or
$\tau'_{i-1}$ is possible. By statement (ii), in both cases agent $a_{i-1}$ holds
$x_q$ in $\sigma'_{i-1}$. If $i < s$, then by statement (iv) agent $a_i$ can only trade
$x_{b_i}$ away in order to obtain $x_q$ in any trade $\tau_k$ between two assignments
$\sigma_{k-1}$ and $\sigma_k$ in $\phi$. Since $\tau'_i = \tau_k$ for some $k \in [r]$, it
holds that $\tau'_i$ is possible, a contradiction. If $i = s$, then by statement (iii)
agent $a_s$ is agent $I$, which by assumption initially holds $x_p = x_{b_s}$, and
hence $\tau'_i$ is possible, a contradiction.


We next present the main algorithm of this subsection and prove that it solves
REACHABLE OBJECT when the maximum length of preference lists is at most
three. Pseudo-code is given in Algorithm 7.1. The idea therein is to model
possible swaps that involve $x$ as arcs in a directed graph. Each arc $(i, j)$ in this
graph represents the fact that if agent $i$ obtains object $x$, then it can swap it to
agent $j$ in exchange for object $x_j$. A directed path from the agent that initially
holds $x$ to agent $I$ then corresponds to a sequence of swaps such that agent $I$
obtains object $x$ in the end. We then consider the third object
$x_w \notin \{x_I, x\}$ which appears in the preference list of $I$ and build a similar
directed graph for $x_w$. The directed paths from agent $w$ to agent $I$ in it again
correspond to sequences of swaps such that agent $I$ obtains object $x_w$.


Proposition 7.3. REACHABLE OBJECT can be solved in linear time when the 
maximum length of preference lists is at most three. 


Proof. Let


$(V := \{1, 2, \ldots, n\}, X := \{x_1, x_2, \ldots, x_n\}, P, \sigma_0, G = (V, E), I, x)$
Algorithm 7.1: Algorithm for REACHABLE OBJECT when the maximum
length of preference lists is at most three.
Input:  A set $V$ of agents, preference lists $(\succ_i)_{i \in V}$ of length at most
        three, and a graph $(V, E)$.
Output: true if agent $I$ can receive object $x$ that is initially held by
        agent $a$ and false otherwise.
 1  $F \leftarrow \{(i, j) \mid \{i, j\} \in E \wedge x_j \succ_i x \wedge x \succ_j x_j\}$
    // If agent $i$ obtains $x$, then it can swap it to agent $j$ for $x_j$.
 2  $D \leftarrow (V, F)$
 3  if $D$ admits a directed path $P$ from $a$ to $I$ then return true
 4  if $x \succ_I x_w \succ_I \boxed{x_I}$ for some $x_w \neq x$ then
 5      $F_1 \leftarrow \{(i, j) \mid \{i, j\} \in E \wedge x_j \succ_i x_w \wedge x_w \succ_j x_j\}$
 6      $F_2 \leftarrow \{(i, j) \mid j \neq I \wedge \{i, j\} \in E \wedge x_j \succ_i x \wedge x \succ_j x_j\}$
 7      $F_3 \leftarrow \{(i, I) \mid \{i, I\} \in E \wedge x_w \succ_i x\}$
        // If $i$ obtains object $x$, then it swaps it to agent $I$ for $x_w$.
 8      $D_1 \leftarrow (V, F_1)$
 9      $D_2 \leftarrow (V, F_2 \cup F_3)$
10      if $D_1$ admits a directed path from $w$ to $I$ and $(w, a) \in F_1$ then
            // Object $x$ is held by agent $w$ after the first swap
11          if $D_2$ admits a directed path from $w$ to $I$ then return true
12      if $D_1 - \{a\}$ admits a directed path from $w$ to $I$ then
13          if $D_2$ admits a directed path from $a$ to $I$ then return true
14  return false


be an instance of REACHABLE OBJECT where $\sigma_0(i) = x_i$ for all $i \in [n]$
and where the maximum length of preference lists is at most three. Let $a$
be the agent that initially holds object $x$. We use Algorithm 7.1 to prove this
proposition. We start with showing that if object $x$ is reachable for agent $I$,
then Algorithm 7.1 returns true. To this end, assume that there exists a sequence
$\phi = (\sigma_0, \sigma_1, \ldots, \sigma_t)$ of swaps such that $\sigma_t(I) = x$. We
assume without loss of generality that $\sigma_{t-1}(I) = x_p \neq x$. We then
distinguish between the two cases $x_p = x_I$ and $x_p \neq x_I$. If $x_p = x_I$, then
using $x_p = x_I$ and $x_q = x$, the sequence
$\phi' = (\sigma'_0, \sigma'_1, \ldots, \sigma'_s)$ as defined in Lemma 7.2 is a sequence
of swaps such that $\sigma'_0 = \sigma_0$ and $\sigma'_s(I) = x$. By Lemma 7.2(ii)
to (iv), graph $D$ as constructed in Line 2 must contain a path from $a$ to $I$.
Thus, Algorithm 7.1 returns true in Line 3.

If $x_p \neq x_I$, then the preference list of agent $I$ is
$x \succ x_w \succ \boxed{x_I}$ and $x_p = x_w$. Moreover, agent $I$ obtains $x_w$
during $\phi$ and thus there are $\sigma_{r-1}$ and $\sigma_r$ such that
$\tau_r = \{(I, x_I), (k, x_w)\}$ for some agent $k$. By Lemma 7.2 (using
$x_p = x_I$ and $x_q = x_w$), the sequence
$\phi' = (\sigma'_0, \sigma'_1, \ldots, \sigma'_s)$ as defined in Lemma 7.2 is a
sequence of swaps such that $\sigma'_0 = \sigma_0$ and $\sigma'_s(I) = x_w$. Let
$a_1, a_2, \ldots, a_s$ be the agents that hold object $x_w$ during $\phi'$. It follows
from Lemma 7.2(iv) and the definition of $D_1$ in Line 8 that $\phi'$ defines a
directed path $(a_0, a_1, a_2, \ldots, a_s)$ with $a_0 = w$ and $a_s = I$ in $D_1$. By
Lemma 7.2(v), no agent in $\{a_2, a_3, \ldots, a_{s-1}\}$ prefers $x$ over its initially
held object. By Lemma 7.2(ii) and (iv), it holds that
$\tau'_1 = \{(w, x_w), (a_1, x_{a_1})\}$. We then distinguish between the two cases
$a_1 = a$ and $a_1 \neq a$. If $a_1 = a$, then none of the agents in
$\{a_2, a_3, \ldots, a_{s-1}\}$ can
be involved in a swap where $x$ is traded. Observe that in this case considering
the initial assignment $\sigma^*_0$ with $\sigma^*_0(I) = x_w$, $\sigma^*_0(w) = x_a$,
$\sigma^*_0(a) = x_I$, and $\sigma^*_0(i) = \sigma_0(i)$ for all other agents $i$ is
equivalent to the original instance.
Using Lemma 7.2 with $x_p = x_w$ and $x_q = x$ then states that there is a sequence
$\phi^* = (\sigma^*_0, \sigma^*_1, \ldots, \sigma^*_{s'})$ of swaps that starts with this
initial assignment and satisfies $\sigma^*_{s'}(I) = x$. It follows from
Lemma 7.2(iv) and the definition of $D_2$ in Line 9 that $\phi^*$ defines a directed
path $(b_0, b_1, b_2, \ldots, b_{s'})$ with $b_0 = w$ and $b_{s'} = I$ in $D_2$. Thus,
Algorithm 7.1 returns true in Line 11.

If $a_1 \neq a$, then none of the agents in $\{a_0, a_1, \ldots, a_{s-1}\}$ can be
involved in a swap where $x$ is traded and agent $a$ cannot receive $x_w$ during
$\phi$. Hence, $D_1 - \{a\}$ contains a directed path from $w$ to $I$ and $D_2$
contains a directed path from $a$ to $I$. Thus, Algorithm 7.1 returns true in
Line 13.

We next show that if the algorithm returns true, then there exists a sequence
$\phi = (\sigma_0, \sigma_1, \ldots, \sigma_t)$ of swaps such that $\sigma_t(I) = x$. If
the algorithm returns true, it does so either in Line 3, in Line 11, or in Line 13.
If the algorithm returns true in Line 3, then let
$(a_0 := a, a_1, \ldots, a_t := I)$ be a directed path in $D$. Let $\sigma_i$ be an
assignment such that $\sigma_i(a_i) := \sigma_{i-1}(a_{i-1})$,
$\sigma_i(a_{i-1}) := \sigma_{i-1}(a_i)$, and $\sigma_i(j) := \sigma_{i-1}(j)$ for all
other agents $j$. By definition of rational trades and $D$, it holds that
$(\sigma_0, \sigma_1, \ldots, \sigma_t)$ is a sequence of swaps. Note that by
construction $\sigma_i(a_i) = x$ for all $i \in [t]$ and thus
$\sigma_t(a_t) = \sigma_t(I) = x$.

If the algorithm returns true in Line 11, then let
$(a_0 := w, a_1 := a, \ldots, a_s := I)$ be a directed path in $D_1$ and let
$(b_0 := w, b_1, \ldots, b_t := I)$ be a directed path in $D_2$. Let $\sigma_i$ be an
assignment such that


1. $\sigma_i(a_i) := \sigma_{i-1}(a_{i-1})$ and $\sigma_i(a_{i-1}) := \sigma_{i-1}(a_i)$
for all $i \in [s]$,

2. $\sigma_{s+i}(b_i) := \sigma_{s+i-1}(b_{i-1})$ and
$\sigma_{s+i}(b_{i-1}) := \sigma_{s+i-1}(b_i)$ for all $i \in [t]$, and


3. $\sigma_i(j) := \sigma_{i-1}(j)$ for all $i \in [s+t]$ and all agents $j$ that are
not assigned objects by the above.


By definition of rational trades and $D_1$ and $D_2$, it holds that
$(\sigma_0, \sigma_1, \ldots, \sigma_{s+t})$ is a sequence of swaps. Note that by
construction $\sigma_{s+i}(b_i) = x$ for all $i \in [t]$ and thus
$\sigma_{s+t}(b_t) = \sigma_{s+t}(I) = x$.

Finally, the case where the algorithm returns true in Line 13 is completely
analogous for the two directed paths


$(a_0 := w, a_1, \ldots, a_s := I)$ and $(b_0 := a, b_1, \ldots, b_t := I)$.


It remains to analyze the running time. We start with constructing $D := (V, F)$
in $O(n + m)$ time. The constructions of $D_1$ and $D_2$ are analogous. To construct
$D = (V, F)$, we go through each edge $\{i, j\}$ in the input graph and check
in constant time whether agent $i$ prefers $x_j$ over $x$ and whether agent $j$
prefers $x$ over $x_j$. All remaining steps are searches for directed paths in
graphs. Using dynamic programming on the topological orders of the constructed
DAGs, each of these steps can be computed in $O(n + m)$ time. Thus, the overall
running time is $O(n + m)$.
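The steps of Algorithm 7.1 can be sketched in Python as follows. This is a
transcription under stated assumptions, not a reference implementation: all
function names are hypothetical, objects are named by the agent that initially
holds them (object x_i is the integer i), preference lists are truncated at the
initially held object, and the path searches use breadth-first search instead
of dynamic programming on a topological order. The six-agent cycle at the end
is modelled on Figure 7.2.

```python
from collections import deque


def prefers(pref, a, b):
    """Strict preference of a over b; objects missing from `pref` rank last."""
    rank = {obj: pos for pos, obj in enumerate(pref)}
    return rank.get(a, len(pref)) < rank.get(b, len(pref))


def has_path(arcs, source, target, removed=frozenset()):
    """Breadth-first search for a directed source-target path avoiding `removed`."""
    if source in removed or target in removed:
        return False
    succ = {}
    for u, v in arcs:
        if u not in removed and v not in removed:
            succ.setdefault(u, []).append(v)
    seen, queue = {source}, deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return True
        for v in succ.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def forward_arcs(prefs, edges, obj):
    """Arcs (i, j): i prefers x_j over obj and j prefers obj over x_j."""
    arcs = set()
    for u, v in edges:
        for i, j in ((u, v), (v, u)):
            if prefers(prefs[i], j, obj) and prefers(prefs[j], obj, j):
                arcs.add((i, j))
    return arcs


def reachable_len3(prefs, edges, I, x):
    """Sketch of Algorithm 7.1; object x is initially held by agent a = x."""
    a = x
    F = forward_arcs(prefs, edges, x)                      # Line 1
    if has_path(F, a, I):                                  # Lines 2-3
        return True
    if len(prefs[I]) != 3 or prefs[I][0] != x:             # Line 4
        return False
    w = prefs[I][1]                                        # object x_w, held by agent w
    F1 = forward_arcs(prefs, edges, w)                     # Line 5
    F2 = {(i, j) for (i, j) in F if j != I}                # Line 6
    F3 = {(i, I) for u, v in edges for i, j in ((u, v), (v, u))
          if j == I and prefers(prefs[i], w, x)}           # Line 7
    D2 = F2 | F3                                           # Line 9
    if has_path(F1, w, I) and (w, a) in F1 and has_path(D2, w, I):  # Lines 10-11
        return True
    if has_path(F1, w, I, removed={a}) and has_path(D2, a, I):      # Lines 12-13
        return True
    return False


prefs = {1: [3, 4, 1], 2: [1, 4, 2], 3: [2, 4, 3],
         4: [5, 3, 4], 5: [6, 3, 5], 6: [4, 3, 6]}
cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
print(reachable_len3(prefs, cycle, I=1, x=3))
```

On this cycle the first phase moves x_4 from agent 4 via agents 3 and 2 to
agent 1, and the second phase moves x_3 via agents 4, 5, and 6 to agent 1.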


7.2.2 Maximum Length at Most Four 


Complementing the result from the previous subsection, we next show that 
REACHABLE OBJECT is already NP-hard when the maximum length of preference 
lists is four even if the input graph is restricted to be a clique. The hardness
on cliques implies that the computational hardness of the problem does not
stem from restricting the possible swaps between agents by an underlying social
network. To show NP-hardness, we reduce from a restricted variant of 3-SAT.
In this variant, called 2P1N-SAT (two-positive-and-one-negative SAT), each 
clause has either two or three literals, and each variable appears once as a 
negative literal and either once or twice as a positive literal. 2P1N-SAT is 
known to be NP-complete [Tov84]. 
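The syntactic restrictions of 2P1N-SAT are easy to check mechanically. The
following sketch assumes a DIMACS-like encoding that is not part of the text:
a clause is a list of non-zero integers, where i stands for the positive
literal of the i-th variable and -i for its negative literal. The function name
and the example formulas are hypothetical.

```python
from collections import Counter


def is_2p1n(clauses):
    """Check the 2P1N-SAT restrictions on a formula given as lists of ints."""
    # Each clause has either two or three literals.
    if any(len(clause) not in (2, 3) for clause in clauses):
        return False
    pos, neg = Counter(), Counter()
    for clause in clauses:
        for lit in clause:
            (pos if lit > 0 else neg)[abs(lit)] += 1
    variables = set(pos) | set(neg)
    # Each variable appears once negatively and once or twice positively.
    return all(neg[v] == 1 and pos[v] in (1, 2) for v in variables)


print(is_2p1n([[2, 3], [1, -2, -3], [-1, 2, 3]]))
print(is_2p1n([[1, 2], [1, -2]]))
```

The first formula satisfies all restrictions, whereas in the second formula the
first variable never appears as a negative literal.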
We start with some intuition. Let


$\Phi := (V = \{v_1, \ldots, v_n\}, C = \{C_1, \ldots, C_m\})$


be an instance of 2P1N-SAT. The general idea of the reduction is to have a 
set of agents for each variable and one agent for each literal in a clause in ®. 
The agents representing variables can then pass objects to agents representing
an occurrence of this variable in a clause such that 


1. only agents representing positive literals or only the agent that represents 
the negative literal of this variable can receive these objects, 


2. an agent representing a certain literal in a clause can pass the target 
object x to an agent in “the next clause” if and only if it received a 
respective object from one of the agents representing the respective variable, 
and 


3. the target object x can reach the target agent I only if it passes through 
all clauses. 


Before we formally describe our construction, we first introduce some notation.
For each variable $v_i \in V$, let $\mathrm{occ}(i)$ be the number of occurrences of
variable $v_i$ (note that $\mathrm{occ}(i) \in \{2, 3\}$), let $\nu(i)$ denote the index
of the clause that contains the negative literal $\neg v_i$, and let $\pi_1(i)$ and
$\pi_2(i)$ with $\pi_1(i) < \pi_2(i)$ be the indices of the clauses that contain the
positive literal $v_i$. If $\mathrm{occ}(i) = 2$, then we simply neglect $\pi_2(i)$. For
a clause $C_j$, we denote by $|C_j|$ the number of literals that $C_j$ contains. For
each clause $C_j \in C$, we use an arbitrary but fixed order of the literals in
$C_j$ to define a bijective function $f_j\colon C_j \to \{1, \ldots, |C_j|\}$, which
assigns to each literal contained in $C_j$ a distinct number from
$\{1, 2, \ldots, |C_j|\}$.
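The notation can be computed directly from a formula. The sketch below is an
illustration under assumptions that are not part of the text: clauses are lists
of non-zero integers (i for the positive and -i for the negative literal of the
i-th variable), clause indices and literal positions are 1-based, the order of
the literals inside each list serves as the fixed order defining f_j, and the
function names are hypothetical.

```python
def occurrence_indices(clauses):
    """Return (occ, nu, pi): occurrence counts, the clause index of the negative
    literal, and the sorted clause indices of the positive literals per variable."""
    occ, nu, pi = {}, {}, {}
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            v = abs(lit)
            occ[v] = occ.get(v, 0) + 1
            if lit < 0:
                nu[v] = j
            else:
                pi.setdefault(v, []).append(j)
    return occ, nu, pi


def position(clauses, j, lit):
    """f_j(lit): the 1-based position of literal `lit` in clause C_j."""
    return clauses[j - 1].index(lit) + 1


clauses = [[2, 3], [1, -2, -3], [-1, 2, 3]]
occ, nu, pi = occurrence_indices(clauses)
print(occ, nu, pi, position(clauses, 2, -3))
```

Here pi[v][0] plays the role of the first positive clause index and pi[v][1],
if present, the role of the second one.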

We next give the formal description of the construction of the REACHABLE
OBJECT instance. Afterwards, we show how these definitions match the intuition
we gave earlier and show an example of the construction. Finally, we formally
prove the correctness of the construction.


Construction 7.4. Let
$\Phi = (V = \{v_1, \ldots, v_n\}, C = \{C_1, \ldots, C_m\})$ be an
instance of 2P1N-SAT. We construct an instance of REACHABLE OBJECT as
follows.


Agents and objects. For each variable $v_i \in V$, we define
$\mathrm{occ}(i) - 1$ variable agents $U_i^1$ (and $U_i^2$ if $\mathrm{occ}(i) = 3$) and
$\mathrm{occ}(i) - 1$ objects $x_i^1$ (and $x_i^2$ if $\mathrm{occ}(i) = 3$). For each
clause $C_j \in C$, we define $2|C_j| + 1$ clause agents $A_j$, $B_j^z$, and $D_j^z$,
where $z \in [|C_j|]$. Moreover, we define $2|C_j| + 1$ objects $a_j$, $b_j^z$, and
$d_j^z$, where $z \in [|C_j|]$.


Finally, there is a special agent $I$ and a special object $x$.
Initial assignment and graph. For each $i \in [n]$ and each
$z \in [\mathrm{occ}(i) - 1]$ agent $U_i^z$ initially holds object
$x_i^{\mathrm{occ}(i) - z}$. For each $j \in [m]$ and each $z \in [|C_j|]$ agent $B_j^z$
initially holds object $b_j^z$ and agent $D_j^z$ initially holds object $d_j^z$.
Finally, for each $j \in [m - 1]$, agent $A_{j+1}$ initially holds object $a_j$,
agent $A_1$ initially holds $x$, and agent $I$ initially holds $a_m$.

The graph $G := (V, E)$ is complete, that is, $E := \binom{V}{2}$.


Preference lists. We next describe the preference list of each agent. Therein,
we only specify the relevant part, that is, the preference list up to the object
that the agent initially holds. We again mark the initially held object with a
box. For a given variable $v_i \in V$, let $j = \nu(i)$, $j' = \pi_1(i)$, and if
$\mathrm{occ}(i) = 3$, then $j'' = \pi_2(i)$. If $\mathrm{occ}(i) = 2$, then the
preference list of $U_i^1$ is


$d_{j'}^{f_{j'}(v_i)} \succ d_j^{f_j(\neg v_i)} \succ \boxed{x_i^1}$


and if $\mathrm{occ}(i) = 3$, then the preference lists of $U_i^1$ and $U_i^2$ are


$d_{j'}^{f_{j'}(v_i)} \succ x_i^1 \succ d_j^{f_j(\neg v_i)} \succ \boxed{x_i^2}$ and


$d_{j''}^{f_{j''}(v_i)} \succ x_i^2 \succ \boxed{x_i^1}$, respectively.

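For $j \in [2, m]$, the preference list of $A_j$ is


$b_j^1 \succ \ldots \succ b_j^{|C_j|} \succ \boxed{a_{j-1}}$.


For each $z \in [|C_j|]$, let $\ell_z$ be the literal such that $f_j(\ell_z) = z$. The
preference list of $B_j^z$ is then


$\tau(C_j, \ell_z) \succ x \succ a_{j-1} \succ \boxed{b_j^z}$,


where

$\tau(C_j, \ell) := x_i^1$, if $\mathrm{occ}(i) = 2$ and $\ell = \neg v_i$ for some variable $v_i$,
$\tau(C_j, \ell) := x_i^2$, if $\mathrm{occ}(i) = 3$ and $\ell = \neg v_i$ for some variable $v_i$,
$\tau(C_j, \ell) := x_i^1$, if $\ell = v_i$ and $j = \pi_1(i)$ for some variable $v_i$, and
$\tau(C_j, \ell) := x_i^2$, if $\ell = v_i$ and $j = \pi_2(i)$ for some variable $v_i$.


The preference list of $D_j^z$ is


$a_j \succ x \succ \tau(C_j, \ell_z) \succ \boxed{d_j^z}$.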
The preference list of $A_1$ is


$b_1^1 \succ \ldots \succ b_1^{|C_1|} \succ \boxed{x}$.


For each $z \in [|C_1|]$, let $\ell_z$ be the literal such that $f_1(\ell_z) = z$. The
preference lists of $B_1^z$ and $D_1^z$ are


$\tau(C_1, \ell_z) \succ x \succ \boxed{b_1^z}$ and


$a_1 \succ x \succ \tau(C_1, \ell_z) \succ \boxed{d_1^z}$, respectively.


Finally, the preference list of agent $I$ is $x \succ \boxed{a_m}$.
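To make the preference lists of Construction 7.4 tangible, the following sketch
generates them for a given formula. It reflects one reading of the construction
and rests on several assumptions that are not part of the text: clauses are
lists of non-zero integers (i for the positive and -i for the negative literal
of the i-th variable), the input is a valid 2P1N-SAT formula, the literal order
in each list defines f_j, agents and objects are encoded as tuples such as
("B", j, z) and ("b", j, z), the special agent and object are the strings "I"
and "x", and the function name is hypothetical.

```python
def build_preferences(clauses):
    """Preference lists (best first, initially held object last) per agent."""
    m = len(clauses)
    occ, nu, pi = {}, {}, {}
    for j, clause in enumerate(clauses, start=1):
        for lit in clause:
            v = abs(lit)
            occ[v] = occ.get(v, 0) + 1
            if lit < 0:
                nu[v] = j
            else:
                pi.setdefault(v, []).append(j)

    def f(j, lit):
        # Position of literal `lit` in clause C_j (1-based).
        return clauses[j - 1].index(lit) + 1

    def tau(j, lit):
        # Variable object that the agents of literal `lit` in C_j ask for.
        v = abs(lit)
        if lit < 0:
            return ("x", v, 1 if occ[v] == 2 else 2)
        return ("x", v, 1 if j == pi[v][0] else 2)

    prefs = {"I": ["x", ("a", m)]}
    for v in occ:
        d_neg = ("d", nu[v], f(nu[v], -v))
        d_pos1 = ("d", pi[v][0], f(pi[v][0], v))
        if occ[v] == 2:
            prefs[("U", v, 1)] = [d_pos1, d_neg, ("x", v, 1)]
        else:
            d_pos2 = ("d", pi[v][1], f(pi[v][1], v))
            prefs[("U", v, 1)] = [d_pos1, ("x", v, 1), d_neg, ("x", v, 2)]
            prefs[("U", v, 2)] = [d_pos2, ("x", v, 2), ("x", v, 1)]
    for j, clause in enumerate(clauses, start=1):
        held_by_a = "x" if j == 1 else ("a", j - 1)
        prefs[("A", j)] = [("b", j, z) for z in range(1, len(clause) + 1)] + [held_by_a]
        for z, lit in enumerate(clause, start=1):
            middle = [] if j == 1 else [("a", j - 1)]
            prefs[("B", j, z)] = [tau(j, lit), "x"] + middle + [("b", j, z)]
            prefs[("D", j, z)] = [("a", j), "x", tau(j, lit), ("d", j, z)]
    return prefs


prefs = build_preferences([[2, 3], [1, -2, -3], [-1, 2, 3]])
for agent, pref in prefs.items():
    print(agent, pref)
```

For the three-clause formula used here, no generated list has more than four
entries, which matches the bound on the maximum length of a preference list.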


We next explain how these formal definitions follow the general idea we
started with. To this end, note that if for two agents $i$ and $j$ there are no
two objects $x_k$ and $x_\ell$ such that $x_k \succ_i x_\ell \succeq_i x_i$ and
$x_\ell \succ_j x_k \succeq_j x_j$, where $x_i$ and $x_j$ denote the objects initially
held by $i$ and $j$, then by definition of rational trades agents $i$ and $j$ will
never swap objects. In this case, we say that the edge $\{i, j\}$ is irrelevant and
ignore such edges henceforth. All other edges are relevant and by carefully
examining the preference lists of all agents, there are only the following
relevant edges.
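The relevance test for an edge can be phrased as a small function on two
truncated preference lists. This is a sketch under assumptions that are not part
of the text: each list is ordered from most to least preferred and ends with the
initially held object, so membership in a list means "at least as good as the
initial object"; the function name and the string names of the objects are
hypothetical.

```python
def is_relevant(pref_i, pref_j):
    """True if there are objects x_k, x_l with x_k >_i x_l and x_l >_j x_k,
    both at least as good as the respective initially held object."""
    for pos_k, x_k in enumerate(pref_i):
        for x_l in pref_i[pos_k + 1:]:
            # Agent i would give x_l to obtain x_k; agent j must want the opposite.
            if (x_k in pref_j and x_l in pref_j
                    and pref_j.index(x_l) < pref_j.index(x_k)):
                return True
    return False


a_1 = ["b11", "b12", "x"]           # clause agent A_1 of a two-literal clause
b_11 = ["x21", "x", "b11"]          # clause agent B_1^1
d_11 = ["a1", "x", "x21", "d11"]    # clause agent D_1^1
print(is_relevant(a_1, b_11), is_relevant(b_11, d_11), is_relevant(a_1, d_11))
```

In this toy fragment the edges between A_1 and B_1^1 and between B_1^1 and
D_1^1 are relevant, whereas A_1 and D_1^1 share only the object x and therefore
never swap.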


(1) Relevant edges between clause agents representing one clause $C_j$ are for
each $z \in [|C_j|]$

$\{A_j, B_j^z\}$, $\{B_j^z, D_j^z\}$;


that is, all clause agents for one clause form a subdivided star.


(2) Relevant edges between clause agents representing two consecutive clauses
$C_j$ and $C_{j+1}$ are for each $z \in [|C_j|]$ and $z' \in [|C_{j+1}|]$


$\{D_j^z, B_{j+1}^{z'}\}$,


that is, the two vertex sets $\{D_j^z \mid 1 \le z \le |C_j|\}$ and
$\{B_{j+1}^{z'} \mid 1 \le z' \le |C_{j+1}|\}$ form a complete bipartite graph.


(3) Agent $I$ is adjacent to all clause agents $D_m^z$ for $z \in [|C_m|]$.


(4) There are no relevant edges between variable agents except between two
agents $U_i^1$ and $U_i^2$ that represent the same variable $v_i$ with
$\mathrm{occ}(i) = 3$.


(5) Finally, for edges between the variable agents and clause agents, we
distinguish for each variable $v_i \in V$ between $\mathrm{occ}(i) = 2$ and
$\mathrm{occ}(i) = 3$.
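Table 7.1: The preference lists of all agents for the instance $V := \{v_1, v_2, v_3\}$
and $C := \{C_1 = (v_2 \vee v_3), C_2 = (v_1 \vee \neg v_2 \vee \neg v_3), C_3 = (\neg v_1 \vee v_2 \vee v_3)\}$.

$A_1$: $b_1^1 \succ b_1^2 \succ \boxed{x}$
$A_2$: $b_2^1 \succ b_2^2 \succ b_2^3 \succ \boxed{a_1}$
$A_3$: $b_3^1 \succ b_3^2 \succ b_3^3 \succ \boxed{a_2}$

$B_1^1$: $x_2^1 \succ x \succ \boxed{b_1^1}$
$B_1^2$: $x_3^1 \succ x \succ \boxed{b_1^2}$
$B_2^1$: $x_1^1 \succ x \succ a_1 \succ \boxed{b_2^1}$
$B_2^2$: $x_2^2 \succ x \succ a_1 \succ \boxed{b_2^2}$
$B_2^3$: $x_3^2 \succ x \succ a_1 \succ \boxed{b_2^3}$
$B_3^1$: $x_1^1 \succ x \succ a_2 \succ \boxed{b_3^1}$
$B_3^2$: $x_2^2 \succ x \succ a_2 \succ \boxed{b_3^2}$
$B_3^3$: $x_3^2 \succ x \succ a_2 \succ \boxed{b_3^3}$

$D_1^1$: $a_1 \succ x \succ x_2^1 \succ \boxed{d_1^1}$
$D_1^2$: $a_1 \succ x \succ x_3^1 \succ \boxed{d_1^2}$
$D_2^1$: $a_2 \succ x \succ x_1^1 \succ \boxed{d_2^1}$
$D_2^2$: $a_2 \succ x \succ x_2^2 \succ \boxed{d_2^2}$
$D_2^3$: $a_2 \succ x \succ x_3^2 \succ \boxed{d_2^3}$
$D_3^1$: $a_3 \succ x \succ x_1^1 \succ \boxed{d_3^1}$
$D_3^2$: $a_3 \succ x \succ x_2^2 \succ \boxed{d_3^2}$
$D_3^3$: $a_3 \succ x \succ x_3^2 \succ \boxed{d_3^3}$

$I$: $x \succ \boxed{a_3}$

$U_1^1$: $d_2^1 \succ d_3^1 \succ \boxed{x_1^1}$
$U_2^1$: $d_1^1 \succ x_2^1 \succ d_2^2 \succ \boxed{x_2^2}$
$U_2^2$: $d_3^2 \succ x_2^2 \succ \boxed{x_2^1}$
$U_3^1$: $d_1^2 \succ x_3^1 \succ d_2^3 \succ \boxed{x_3^2}$
$U_3^2$: $d_3^3 \succ x_3^2 \succ \boxed{x_3^1}$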


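(a) If $\mathrm{occ}(i) = 2$, then the relevant edges between the variable agent
$U_i^1$ representing $v_i$ and clause agents are

$\{U_i^1, D_{\nu(i)}^{f_{\nu(i)}(\neg v_i)}\}$ and $\{U_i^1, D_{\pi_1(i)}^{f_{\pi_1(i)}(v_i)}\}$.


(b) If $\mathrm{occ}(i) = 3$, then the relevant edges between the variable agents
$U_i^1$ and $U_i^2$ representing $v_i$ and clause agents are


$\{U_i^1, D_{\nu(i)}^{f_{\nu(i)}(\neg v_i)}\}$, $\{U_i^1, D_{\pi_1(i)}^{f_{\pi_1(i)}(v_i)}\}$, and $\{U_i^2, D_{\pi_2(i)}^{f_{\pi_2(i)}(v_i)}\}$.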

We now briefly describe how a solution in the constructed instance corresponds
to a satisfying truth assignment of the original formula. Afterwards, we
present the formal proof. Consider Table 7.1 and Figure 7.3 for an example
of Construction 7.4, relevant edges, and how a satisfying truth assignment to
the original formula corresponds to a solution for the constructed REACHABLE
OBJECT instance. Note that only agents $B_j^z$ and $D_j^z$ for $j \in [m]$ and
$z \in [|C_j|]$ as well as agents $I$ and $A_1$ prefer object $x$ at least as much as
their initially held object. By the analysis of relevant edges above, one can
easily verify that agent $I$ can only receive object $x$ if for each clause $C_j$ at
least one agent $B_j^z$
Figure 7.3: An example of the agents and relevant edges resulting from
Construction 7.4 for the instance $V = \{v_1, v_2, v_3\}$
and $C = \{C_1 = (v_2 \vee v_3), C_2 = (v_1 \vee \neg v_2 \vee \neg v_3), C_3 = (\neg v_1 \vee v_2 \vee v_3)\}$. The
boxes with solid lines indicate the three clause gadgets and the three boxes with
dashed lines display the three variable gadgets. Relevant edges between variable
agents and clause agents are drawn in red only for easier distinction. The preference
lists are listed in Table 7.1. Notice that setting $v_1$ and $v_2$ to false and $v_3$ to true is
a satisfying truth assignment. This corresponds to the following sequence of swaps
that lets agent $I$ obtain object $x$. Setting a variable that occurs thrice to true ($v_3$
in our example) is represented by the swap of the initially held objects of the two
respective variable agents (in our case $U_3^1$ and $U_3^2$ swap $x_3^2$ and $x_3^1$). Afterwards we
decide for each clause on one literal to satisfy this clause. Since in our example $C_1$
and $C_2$ are only satisfied by one literal, we can only choose between $\neg v_1$ and $v_3$
in $C_3$. Let us choose $v_3$ in $C_3$. Next, for each clause $C_j$ the agent $A_j$ swaps with the
chosen $B$-vertex and the chosen $D$-vertex swaps with the respective variable agent.
In our case, $A_1$ swaps with $B_1^2$, $A_2$ swaps with $B_2^2$, $A_3$ swaps with $B_3^3$, $D_1^2$ swaps
with $U_3^1$, $D_2^2$ swaps with $U_2^1$, and $D_3^3$ swaps with $U_3^2$. If all clauses are satisfied, then
object $x$ can be swapped "through all clauses". In the sequence of swaps we described,
object $x$ is held by agents $A_1$, $B_1^2$, $D_1^2$, $B_2^2$, $D_2^2$, $B_3^3$, $D_3^3$, and $I$.


and $D_j^{z'}$ for $z, z' \in [|C_j|]$ held object $x$ before. For agent $D_j^z$ to pass object $x$
to agent $B_{j+1}^{z'}$, it has to receive $a_j$ in return. Since this object is initially held
by agent $A_{j+1}$ and since this agent only shares relevant edges with agents $B_{j+1}^z$
for $z \in [|C_{j+1}|]$, this works as a selection gadget of which literal in clause $C_{j+1}$
should be satisfied. For agent $B_{j+1}^z$ to then trade object $x$ to agent $D_{j+1}^z$,
it has to receive an object representing the corresponding variable in return.
For $D_{j+1}^z$ to receive such an object, it has to receive this from an agent in
the corresponding variable gadget. If this variable only occurs twice, then the
respective object can only be given to one D-agent and this agent will never
give it away as it is its most preferred object. If the variable occurs thrice, then
the agents' preference lists are constructed in a way that if either of the "positive
occurrences" is given an object representing this variable, then the "negative
occurrence" cannot be satisfied.

It remains to formally prove that Construction 7.4 is correct. This leads 
to the main result of this subsection which states that REACHABLE OBJECT 
remains NP-hard when each preference list has length at most four and which 
complements Proposition 7.3. 


Proposition 7.5. REACHABLE OBJECT is NP-hard even if the maximum length 
of preference lists is four and the input graph is restricted to complete graphs. 


Proof. Since each step in Construction 7.4 is polynomial-time computable, since 
all preference lists have by construction length at most four, and since the 
graph is complete, we will focus on showing that the constructed instance is 
equivalent to the original instance. To this end, let $\Phi := (V, \mathcal{C})$ be an instance
of 2P1N-SAT and consider the instance of REACHABLE OBJECT resulting
from Construction 7.4.

We will first show that if $\Phi$ is satisfiable, then there is a sequence of swaps
such that object $x$ reaches $I$. Let $\beta\colon V \to \{\text{true}, \text{false}\}$ be a satisfying truth
assignment for $\Phi$. First, for each variable $v_i \in V$, if occ(i) = 3 and $\beta(v_i) = \text{true}$,
then let agents $U_i^1$ and $U_i^2$ swap their initially held objects (so that $U_i^1$ and $U_i^2$
hold $x_i^1$ and $x_i^2$, respectively). Second, identify for each clause $C_j$ one literal $\ell_j$
that satisfies $C_j$ under assignment $\beta$. Then, perform the following swaps.


1. Let agents $A_j$ and $B_j^{f_j(\ell_j)}$ swap their initially held objects.

2. Let agents $D_j^{f_j(\ell_j)}$ and $U_i^z$ swap their current objects such that

(a) if $\ell_j = \lnot v_i$, then $z = 1$ (note that in this case agent $U_i^1$ still
holds its initially held object),
(b) if $\ell_j = v_i$ and $j = \pi_1(i)$, then $z = 1$ (note that in this case agent $U_i^1$
is holding object $x_i^1$), and
(c) if $\ell_j = v_i$ and $j = \pi_2(i)$, then $z = 2$ (note that in this case agent $U_i^2$
is holding object $x_i^2$).

After these swaps, agent $B_1^{f_1(\ell_1)}$ holds object $x$ and agent $B_j^{f_j(\ell_j)}$ holds
object $a_{j-1}$ for each $j \in [2, m]$. Moreover, for each $j \in [m]$, agent $D_j^{f_j(\ell_j)}$ holds
object $\tau(C_j, \ell_j)$. Third, for each $j \in [m - 1]$ iteratively perform the following
swaps.

1. Agents $B_j^{f_j(\ell_j)}$ and $D_j^{f_j(\ell_j)}$ swap their current objects (so that $D_j^{f_j(\ell_j)}$
holds object $x$ afterwards).

2. Agents $D_j^{f_j(\ell_j)}$ and $B_{j+1}^{f_{j+1}(\ell_{j+1})}$ swap their current objects (so that
agent $B_{j+1}^{f_{j+1}(\ell_{j+1})}$ holds object $x$ afterwards).

After these swaps, agent $B_m^{f_m(\ell_m)}$ holds object $x$. Finally, agent $B_m^{f_m(\ell_m)}$ can
swap object $x$ in exchange for object $\tau(C_m, \ell_m)$ with agent $D_m^{f_m(\ell_m)}$ who can
then swap $x$ in exchange for object $a_m$ with agent $I$. Thus, object $x$ is reachable
for agent $I$ and the constructed instance is a yes-instance.

For the other direction, assume that there is a sequence $(\sigma_0, \sigma_1, \ldots, \sigma_s)$ of
swaps such that $\sigma_s(I) = x$. We show how to construct a satisfying truth
assignment for $\Phi$ using the following claim that formalizes the idea that object $x$
has to pass "through all clauses".


Claim 7.6. For each clause $C_j \in \mathcal{C}$, there exist assignments $\sigma_r$ and $\sigma_{r+1}$ and
a literal $\ell_j \in C_j$ such that

1. $\sigma_r(B_j^{f_j(\ell_j)}) = x$,

2. $\sigma_r(D_j^{f_j(\ell_j)}) = \tau(C_j, \ell_j)$,

3. $\sigma_{r+1}(B_j^{f_j(\ell_j)}) = \tau(C_j, \ell_j)$, and

4. $\sigma_{r+1}(D_j^{f_j(\ell_j)}) = x$.


Proof of Claim 7.6. We prove the claim by induction over $j$, starting with $j = m$.
In the initial assignment $\sigma_0$, agent $I$ holds object $a_m$. Note that $I$ prefers only $x$
over its initially held object $a_m$ and only the agents $D_m^z$ with $z \in [|C_m|]$ prefer $a_m$
over $x$. Hence, $I$ has to swap with one of these agents to obtain $x$. Let $\ell_m$ be
the literal with $f_m(\ell_m) = z$. In order for agent $D_m^z$ to obtain object $x$ to swap it
to $I$, it must hold object $\tau(C_m, \ell_m)$ and swap it for $x$ since no agent will trade $x$
for $d_m^z$. Observe that agent $B_m^z$ is the only agent that prefers $\tau(C_m, \ell_m)$ over $x$
and $B_m^z$ must therefore trade object $x$ to $D_m^z$ in exchange for object $\tau(C_m, \ell_m)$.
Thus, there are assignments $\sigma_r$ and $\sigma_{r+1}$ such that

1. $\sigma_r(B_m^z) = x$,

2. $\sigma_r(D_m^z) = \tau(C_m, \ell_m)$,

3. $\sigma_{r+1}(B_m^z) = \tau(C_m, \ell_m)$, and

4. $\sigma_{r+1}(D_m^z) = x$.

We now show that if for some $j \in [m - 1]$ there are assignments $\sigma_{r'}$ and $\sigma_{r'+1}$
and a literal $\ell_{j+1}$ that fulfill the claim for clause $C_{j+1}$, then there are also
assignments $\sigma_r$ and $\sigma_{r+1}$ and a literal $\ell_j$ that fulfill the claim for clause $C_j$. By
definition, agent $B_{j+1}^{f_{j+1}(\ell_{j+1})}$ must have obtained object $x$ at some point. Since it
prefers $x$ only over objects $a_j$ and $b_{j+1}^{f_{j+1}(\ell_{j+1})}$ and since no agent prefers $b_{j+1}^{f_{j+1}(\ell_{j+1})}$
over $x$, it follows that agent $B_{j+1}^{f_{j+1}(\ell_{j+1})}$ must have swapped object $a_j$ with some
other agent for $x$. Since only agents from $\{D_j^z \mid z \in [|C_j|]\}$ prefer $a_j$ over $x$, it
follows that $B_{j+1}^{f_{j+1}(\ell_{j+1})}$ must have swapped with some agent $D_j^{z'}$ with $z' \in [|C_j|]$
to obtain object $x$. Let $\ell_j$ be the literal such that $f_j(\ell_j) = z'$. Now consider how
agent $D_j^{z'}$ can obtain object $x$. Similarly to the case with agent $D_m^z$, agent $D_j^{z'}$
must swap object $\tau(C_j, \ell_j)$ with agent $B_j^{z'}$ to obtain $x$ as no agent prefers $d_j^{z'}$
over $x$. Thus, there are assignments $\sigma_r$ and $\sigma_{r+1}$ such that

1. $\sigma_r(B_j^{z'}) = x$,

2. $\sigma_r(D_j^{z'}) = \tau(C_j, \ell_j)$,

3. $\sigma_{r+1}(B_j^{z'}) = \tau(C_j, \ell_j)$, and

4. $\sigma_{r+1}(D_j^{z'}) = x$. ◇


We conclude the proof by constructing a truth assignment $\beta$ and showing that
it satisfies $\Phi$ using Claim 7.6. For each variable $v_i \in V$, let

\[
\beta(v_i) :=
\begin{cases}
\text{false}, & \text{if } D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)} \text{ swapped object } x \text{ with } B_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)},\\
\text{true}, & \text{otherwise.}
\end{cases}
\]

Assume towards a contradiction that $\beta$ does not satisfy $\Phi$, that is, there is
some clause $C_j \in \mathcal{C}$ that is not satisfied by $\beta$. By Claim 7.6, let $\ell_j \in C_j$ be
a literal such that $D_j^{f_j(\ell_j)}$ swapped object $x$ with $B_j^{f_j(\ell_j)}$ for object $\tau(C_j, \ell_j)$.
Observe that $\ell_j \in \{v_i, \lnot v_i\}$ for some $v_i \in V$. We now distinguish between
the two cases $\ell_j = \lnot v_i$ and $\ell_j = v_i$. If $\ell_j = \lnot v_i$, then notice that $j = \nu(i)$
since each variable occurs exactly once as a negative literal. Thus, $D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$
swapped object $x$ with $B_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$. By construction, $\beta(v_i) = \text{false}$ and thus $C_j$
is satisfied, a contradiction.
If $\ell_j = v_i$, then $v_i \in C_j$. Since $C_j$ is not satisfied by $\beta$, it holds that $\lnot v_i \notin C_j$
and $\beta(v_i) = \text{false}$. It then follows from the construction of $\beta$ that $D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$
swapped object $x$ with $B_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$. Note that, by the construction of the
preference lists, $D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$ can only have given object $\tau(C_{\nu(i)}, \lnot v_i)$ for $x$ in this
trade. We will show that this contradicts the assumption that $D_j^{f_j(\ell_j)}$ swapped
object $x$ with $B_j^{f_j(\ell_j)}$ for object $\tau(C_j, \ell_j)$ using a case distinction over occ(i).

If occ(i) = 2, then the definition of $\tau$ yields $\tau(C_j, v_i) = \tau(C_{\nu(i)}, \lnot v_i) = x_i^1$.
Since both agents $D_j^{f_j(v_i)}$ and $D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$ prefer object $x_i^1$ the most, once one of
the two agents received it, the same object cannot be used to be swapped to the
respective other agent. Thus, not both of the constructed swaps can happen
during the sequence of swaps, a contradiction.

If occ(i) = 3, then by the definition of $\tau$ it holds that $\tau(C_{\nu(i)}, \lnot v_i) = x_i^2$.
Since agent $D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$ received object $x_i^2$, it must have received it from agent $U_i^1$
(as $U_i^2$ and $D_{\nu(i)}^{f_{\nu(i)}(\lnot v_i)}$ share no relevant edge). Thus agents $U_i^1$ and $U_i^2$ did
not swap their initially held objects. We make a final case distinction on
whether $j = \pi_1(\ell_j)$ or $j = \pi_2(\ell_j)$. If $j = \pi_1(\ell_j)$, then agent $D_j^{f_j(\ell_j)}$ must have
received object $x_i^1$ from agent $U_i^1$. If $j = \pi_2(\ell_j)$, then agent $D_j^{f_j(\ell_j)}$ must have
received object $x_i^2$ from agent $U_i^2$. In both cases agents $U_i^1$ and $U_i^2$ swapped
their initially held objects, a contradiction.


This concludes the dichotomy result for the maximum length $\ell$ of preference
lists as Proposition 7.3 states that REACHABLE OBJECT is linear-time solvable
if $\ell \leq 3$ and Proposition 7.5 complements this result by showing that REACHABLE
OBJECT remains NP-hard for $\ell = 4$. Note that if we replace the complete graph
in Construction 7.4 by the graph that only contains the relevant edges, then, for
each $\ell > 4$, we can simply add a new agent with an arbitrary preference list of
length $\ell$ that is only adjacent to agent $I$. This agent can never swap its initially
held object $o$ since $I$ does not prefer $o$ over its initially held object. This implies
NP-hardness of REACHABLE OBJECT with respect to the maximum length $\ell$ of
preference lists for each $\ell > 4$.


7.3 Cycles 


In this section, we prove that REACHABLE OBJECT on n-vertex cycles is solvable 
in O(n*) time. This generalizes an O(n*)-time algorithm for REACHABLE 
OBJECT on paths by Huang and Xiao [HX20]. The main difference between our 
algorithm and the algorithm by Huang and Xiao is the fact that two objects can 
only be swapped once in a path but up to twice in a cycle. The main ingredient
to overcome this obstacle is a structural observation which states that for each
solution there is a constant $c$ such that the following holds. For all pairs $(x_i, x_j)$
of objects that are swapped twice in the solution, the two edges
over which $x_i$ and $x_j$ are swapped have distance $c$ in the input cycle. Since
this constant $c$ is the same for all pairs of objects, we will first determine the
value of $c$ and then use it to check for each pair of objects whether they can be
swapped twice in a solution.

Note that we can ignore all connected components in the input instance of 
REACHABLE OBJECT that do not contain J and hence we may assume that the 
input graph is connected. Note further that any connected graph with maximum 
degree two is either a path or a cycle. Thus, our algorithm for cycles and the 
algorithm for paths by Huang and Xiao [HX20] prove that REACHABLE OBJECT 
is polynomial-time solvable for graphs of maximum degree two. Saffidine and 
Wilczynski [SW18] showed that REACHABLE OBJECT remains NP-hard on 
graphs of maximum degree four. The general idea for our algorithm is as follows. 
Note that Observation 7.1 implies that there are only two possible paths of 
agents in the graph that can hold the target object x before the target agent I 
can obtain it. We will then guess¹ the path of agents that hold $x$ during a
solution (a sequence of swaps such that agent $I$ obtains $x$). This will allow
us to represent a solution by selecting one object to be swapped with x over 
each edge in the guessed path. An example of this is given in Figure 7.4. In 
Subsection 7.3.1, we show that for each edge in this path there are at most two 


¹ As in Chapter 5, guessing refers to the procedure of iterating over all possibilities and
considering "the correct" iteration for the proof.




0: x5 > x4 > x3 > [x0]    1: x3 > x4 > x5 > [x1]
2: x5 > x1 > x4 > [x2]    3: x2 > x4 > [x3]
4: x5 > x3 > [x4]         5: x0 > x3 > [x5]


Figure 7.4: An example of REACHABLE OBJECT on a cycle. Initially held objects are 
drawn in boxes and the question is whether x4 is reachable for agent 0. Note that agent 5 
does not accept object x4 and hence object x4 has to pass the edges {3, 4}, {2,3}, {1, 2}, 
and {0,1} before agent 0 can obtain it. Considering the preference lists of the agents, 
it is easy to verify that only object x3 can be swapped with x4 over the edges {3,4} 
and {0,1}. Analogously, only object x2 can be swapped with x4 over the edge {2,3} 
and objects x1 and x5 are candidates for being swapped with x4 over the edge {1,2}.
Observe that for object x5 to be swapped with x4 over the edge {1,2}, it has to be
swapped over the edge {0,1} which is impossible as agent 0 does not prefer any object
over x5. Hence, the only solution selects objects x1, x2, and x3 to move clockwise
and objects x0, x4, and x5 to move counter-clockwise. Note that the sequence of swaps
resulting from 3 ↔ 4, 4 ↔ 5, 5 ↔ 0, 3 ↔ 2, 2 ↔ 1, 1 ↔ 0 leads to agent 0 obtaining x4.
Therein, "i ↔ j" means that agents i and j swap the objects they currently hold.
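
The swap sequence from the caption of Figure 7.4 can be replayed mechanically. The following Python sketch is only an illustration and not part of the algorithm developed in this section: the function name `replay_swaps` and the list encoding are hypothetical, and the preference lists are transcribed from the figure under the assumption that each list is ordered from most to least preferred and ends with the initially held object.

```python
def replay_swaps(prefs, swaps):
    """Replay swaps on a cycle and return the final object of every agent.

    prefs[a] lists the objects acceptable to agent a from most to least
    preferred; the last entry is the object that a holds initially.
    A swap is only performed if both agents strictly improve.
    """
    n = len(prefs)
    holds = [p[-1] for p in prefs]
    for a, b in swaps:
        # Agents may only swap over an edge of the cycle.
        if (a - b) % n not in (1, n - 1):
            raise ValueError(f"agents {a} and {b} are not adjacent")
        obj_a, obj_b = holds[a], holds[b]
        for agent, old, new in ((a, obj_a, obj_b), (b, obj_b, obj_a)):
            plist = prefs[agent]
            if new not in plist or plist.index(new) >= plist.index(old):
                raise ValueError(f"agent {agent} does not prefer x{new} over x{old}")
        holds[a], holds[b] = obj_b, obj_a
    return holds


# Preference lists as read from Figure 7.4 (object x_i is encoded as i).
FIG_7_4_PREFS = [
    [5, 4, 3, 0],  # agent 0
    [3, 4, 5, 1],  # agent 1
    [5, 1, 4, 2],  # agent 2
    [2, 4, 3],     # agent 3
    [5, 3, 4],     # agent 4
    [0, 3, 5],     # agent 5
]
# The sequence 3<->4, 4<->5, 5<->0, 3<->2, 2<->1, 1<->0 from the caption.
FIG_7_4_SWAPS = [(3, 4), (4, 5), (5, 0), (3, 2), (2, 1), (1, 0)]

final_holdings = replay_swaps(FIG_7_4_PREFS, FIG_7_4_SWAPS)
print("agent 0 holds x%d" % final_holdings[0])
```

Under these assumptions every swap in the sequence is accepted by both participating agents and agent 0 ends up holding x4.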


candidate objects that can be swapped with x over the respective edge. Finally, 
in Subsection 7.3.2, we show how to partition the edges in the path such that 


1. for each part of the partition there are at most two possible choices for 
selecting a candidate for all edges in the respective part and 


2. candidates for two different parts are either incompatible, that is, there is 
no solution for the overall problem that uses the respective candidates, or 
they can be combined independently of the choices of candidates for other 
parts. 


We conclude with the main theorem that states that REACHABLE OBJECT 
on cycles can be solved in O(n*) time. The respective algorithm is a 2-SAT 
program with a variable for each part of the described partition. The truth 
value of this variable represents the choice of candidates for each edge in the 
respective part. The clauses will guarantee that no two incompatible candidates 
are chosen.

Figure 7.5: A cycle with six vertices. The part [2, 4] is colored violet (darker) and [4, 2]
is colored yellow (brighter).
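
The 2-SAT program mentioned above relies on a linear-time satisfiability test as a subroutine. As a rough illustration of that subroutine only (not of the clauses constructed in this chapter), the following Python sketch solves a generic 2-SAT instance via strongly connected components of the implication graph. The function name `solve_2sat` and the encoding of literals as signed integers are assumptions made for this example.

```python
def solve_2sat(num_vars, clauses):
    """Return a satisfying assignment (list of bools, index 0 unused) or None.

    Variables are 1..num_vars; a literal is +v or -v; each clause is a pair
    of literals. Uses Kosaraju's algorithm on the implication graph.
    """
    def node(lit):
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    graph = [[] for _ in range(n)]
    rgraph = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for u, w in ((node(-a), node(b)), (node(-b), node(a))):
            graph[u].append(w)
            rgraph[w].append(u)

    # First pass: iterative DFS that records vertices in post-order.
    order, seen = [], [False] * n
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            u, idx = stack.pop()
            if idx < len(graph[u]):
                stack.append((u, idx + 1))
                w = graph[u][idx]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(u)

    # Second pass on the reversed graph: components are found in
    # topological order of the condensation of the implication graph.
    comp = [-1] * n
    current = 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = current
        stack = [start]
        while stack:
            u = stack.pop()
            for w in rgraph[u]:
                if comp[w] == -1:
                    comp[w] = current
                    stack.append(w)
        current += 1

    assignment = [False] * (num_vars + 1)
    for v in range(num_vars):
        if comp[2 * v] == comp[2 * v + 1]:
            return None  # v and (not v) imply each other
        assignment[v + 1] = comp[2 * v] > comp[2 * v + 1]
    return assignment


example = [(1, 2), (-1, 2), (-2, 3)]
model = solve_2sat(3, example)
print(model)
```

In the setting of this section, one would introduce one variable per part of the partition and one clause per pair of incompatible candidates; those details are developed in the remainder of the section and are not reproduced in the sketch.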


We start with some notation for this section. For the sake of readability, we
assume that the graph is

\[ G := (V = \{0\} \cup [n-1],\; E = \{\{i-1, i\} \mid i \in [n-1]\} \cup \{\{0, n-1\}\}). \]

Furthermore, if we refer to some agent $j$ with $j \notin \{0\} \cup [n-1]$, then we mean
agent $j'$ with $j' = j \bmod n$. For each object $x_i$, we denote by $A(x_i)$ the agent
that initially holds $x_i$, that is, $\sigma_0^{-1}(x_i)$. We assume without loss of generality
that $I := 0$ and refer to the target object as $x$ and define $k := A(x)$.

We use $[i, j]$ to denote the set $\{i, i+1 \bmod n, \ldots, j \bmod n\}$, that is,

\[
[i, j] :=
\begin{cases}
\{i, i+1, \ldots, j\}, & \text{if } j \geq i, \text{ and}\\
\{0, 1, \ldots, j\} \cup \{i, i+1, \ldots, n-1\}, & \text{if } j < i.
\end{cases}
\]


See Figure 7.5 for an example. Finally, we say that an object $x_j$ moves clockwise
if it is swapped from some agent $i$ to agent $i+1$. Analogously, we say that $x_j$
moves counter-clockwise if it is swapped from some agent $i$ to agent $i-1$. By
Observation 7.1, an object that moves clockwise (or counter-clockwise) once will
only move clockwise (or counter-clockwise) in the future.
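
The cyclic interval notation can be made concrete with a few lines of Python. This is a minimal sketch under the assumption that agents are numbered 0, ..., n-1 as above; the helper name `cyclic_interval` is hypothetical.

```python
def cyclic_interval(i, j, n):
    """Return the agents i, i+1 mod n, ..., j mod n in clockwise order."""
    i %= n
    j %= n
    if j >= i:
        return list(range(i, j + 1))
    # The interval wraps around agent 0.
    return list(range(i, n)) + list(range(0, j + 1))


# The two parts shown in Figure 7.5 (a cycle with six vertices).
violet_part = cyclic_interval(2, 4, 6)
yellow_part = cyclic_interval(4, 2, 6)
print(violet_part, yellow_part)
```

Returning a list in clockwise order (instead of a set) is a design choice of this sketch; it makes it easy to iterate over an interval in the direction in which a clockwise object travels.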


Observation 7.7. Let $\phi$ be a sequence of swaps and let $x_i$ be an object. If $x_i$
is swapped during $\phi$, then it either only moves clockwise or only moves
counter-clockwise during $\phi$.

Note that the object $x$ has to move clockwise or counter-clockwise in a solution.
Our main algorithm tries out both possibilities one after another. Since
these two cases work analogously, we will only present the case where $x$ moves
counter-clockwise here. Since $x$ moves counter-clockwise and is initially held by
agent $k := A(x)$, if there is a solution (a sequence of swaps such that agent $I$
obtains object $x$), then $x$ is swapped over each edge in $\{\{i-1, i\} \mid i \in [k]\}$.
Moreover, we can assume that $x$ is swapped over the edge $\{0, 1\}$ in the last
swap of the solution as all swaps afterwards are irrelevant. Our algorithm
guesses the object $z$ with which object $x$ is swapped in this last swap. Note
that there are two possibilities for $x$ moving clockwise or counter-clockwise and
at most $n$ possibilities for choosing $z$. Hence, there are $O(n)$ iterations in our
main algorithm and we can assume that $x$ moves counter-clockwise and that
object $z$ is known. We will use this assumption throughout this section.
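
The outer loop implied by this guessing step can be sketched as follows. This is only an illustration of the enumeration, assuming objects are given as a list and using the hypothetical name `enumerate_guesses`; the work done per guess is the subject of the rest of this section and is not shown.

```python
def enumerate_guesses(objects, target):
    """Yield every pair (direction of the target object, guessed object z)."""
    for direction in ("counter-clockwise", "clockwise"):
        for z in objects:
            if z != target:  # the target cannot be swapped with itself
                yield direction, z


guesses = list(enumerate_guesses(["x0", "x1", "x2", "x3", "x4", "x5"], "x4"))
print(len(guesses), "guesses to try")
```

Since each direction is combined with fewer than n candidates for z, the number of iterations is linear in n.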


Assumption 7.8. Let $\mathcal{I} = (V, X, \mathcal{P}, \sigma_0, G, 0, x)$ with
\[ G = (V = \{0\} \cup [n-1],\; E = \{\{i-1, i\} \mid i \in [n-1]\} \cup \{\{0, n-1\}\}) \]


be an instance of REACHABLE OBJECT on cycles. If x is reachable for agent 0, 
then there is a solution in which x moves counter-clockwise and in the last swap 
of the solution it is swapped with object z over the edge {0,1}. 


We continue with an analysis of how often objects can be swapped in a cycle.
To this end, we first show a helpful lemma. For two objects $x_i$ and $x_j$
that are swapped in a sequence of swaps, it states which other objects are swapped with
either of them before $x_i$ and $x_j$ can be swapped.


Lemma 7.9. Let $x_h$, $x_i$, and $x_j$ be three distinct objects. Let $\phi = (\sigma_0, \sigma_1, \ldots, \sigma_t)$
be a sequence of swaps such that $x_i$ and $x_j$ are swapped between $\sigma_{t-1}$ and $\sigma_t$
and $x_i \neq x$ moves clockwise in $\phi$. Let $r < t - 1$ such that $x_i$ and $x_j$ are not
swapped between $\sigma_{s-1}$ and $\sigma_s$ for any $s \in [r+1, t-1]$. Then, object $x_h$ is
swapped with either $x_i$ or $x_j$ between $\sigma_{s-1}$ and $\sigma_s$ for some $s \in [r+1, t-1]$ if
and only if $\sigma_r^{-1}(x_h) \in [\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]$.


Proof. Note that since $x_i$ and $x_j$ are swapped in $\phi$ and since $x_i$ moves clockwise,
it holds that $x_j$ moves counter-clockwise in $\phi$. We prove the claim by induction
over $|[\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]|$. If $|[\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]| = 2$, then $x_i$ and $x_j$ can only
be swapped over the edge $\{\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)\}$ and hence no other object can be
swapped with either object before $x_i$ and $x_j$ are swapped. Since no object $x_h$
other than $x_i$ and $x_j$ fulfills $\sigma_r^{-1}(x_h) \in [\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]$, this concludes the
base case.

Now assume the statement holds for all objects $x_{i'}$ and $x_{j'}$ such that $x_{i'}$ moves
clockwise, $x_{j'}$ moves counter-clockwise, and $|[A(x_{i'}), A(x_{j'})]| < |[A(x_i), A(x_j)]|$.
Take any object $x_q$ such that $A(x_q) \in [A(x_i), A(x_j)] \setminus \{A(x_i), A(x_j)\}$. We assume
without loss of generality that $x_q$ moves counter-clockwise in $\phi$ as the other
case is analogous. Note that $x_i$ and $x_q$ are swapped in $\phi$ as otherwise $x_q$ would
always stay "between" $x_i$ and $x_j$ and hence $x_i$ and $x_j$ could not be swapped
in $\phi$. By induction hypothesis, if $x_i$ and $x_q$ are swapped between $\sigma_{s-1}$ and $\sigma_s$ for
some $s > t$, then $x_i$ and $x_j$ are not swapped between $\sigma_{t-1}$ and $\sigma_t$, a contradiction.
Hence $x_i$ and $x_q$ are swapped before $x_i$ and $x_j$ are swapped, that is, there is
some $s \in [r+1, t-1]$ such that $x_i$ and $x_q$ are swapped between $\sigma_{s-1}$ and $\sigma_s$.

It remains to show that no object $x_q$ with $A(x_q) \notin [A(x_i), A(x_j)]$ is swapped
with $x_i$ or $x_j$ before $x_i$ and $x_j$ are swapped. This follows from a simple
counting argument. There are $|[\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]| - 1$ edges between $\sigma_r^{-1}(x_i)$
and $\sigma_r^{-1}(x_j)$. The two objects $x_i$ and $x_j$ are swapped over one of these edges.
Over each of the other edges exactly one of the objects is swapped before $x_i$
and $x_j$ are swapped. Thus, $|[\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]| - 2$ objects are swapped with
either $x_i$ or $x_j$ before $x_i$ and $x_j$ are swapped. As shown above, each object $x_h$
with $A(x_h) \in [\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]$ and $x_h \notin \{x_i, x_j\}$ is swapped with $x_i$ or $x_j$
before $x_i$ and $x_j$ are swapped. The counting argument is then completed by
observing that there are $|[\sigma_r^{-1}(x_i), \sigma_r^{-1}(x_j)]| - 2$ such objects.


For an example of Lemma 7.9, recall Figure 7.4. Therein, object $x_3$ moves
clockwise, object $x_0$ moves counter-clockwise, and objects $x_4$ and $x_5$ are initially
held by agents in $[3, 0]$. Lemma 7.9 states that objects $x_4$ and $x_5$ are swapped
with $x_0$ or $x_3$ before objects $x_0$ and $x_3$ are swapped and objects $x_1$ and $x_2$ are
not swapped with $x_0$ or $x_3$ before $x_0$ and $x_3$ are swapped. Lemma 7.9 has three
interesting implications. First, it implies that each pair of objects is swapped
at most twice. Note that the example in Figure 7.4 shows that two objects
($x_3$ and $x_4$ in the example) can be swapped twice in a cycle. To verify that
each pair of objects is swapped at most twice, consider two objects $x_i$ and $x_j$
and the assignment $\sigma_r$ after $x_i$ and $x_j$ are swapped for the first time over an
edge $\{\ell, \ell+1 \bmod n\}$. Lemma 7.9 then states that each object $x_h$ (except for $x_i$
and $x_j$) has to be swapped with either $x_i$ or $x_j$ before $x_i$ and $x_j$ can be swapped
for a second time. Thus, each agent has to hold $x_i$ or $x_j$ between the two swaps
of $x_i$ and $x_j$ as for each of the $n - 2$ objects that are swapped with $x_i$ or $x_j$ a
new agent holds $x_i$ or $x_j$. Since after the second swap of $x_i$ and $x_j$ agent $A(x_i)$
has held $x_i$ and $x_j$, it will by Assumption 7.8 and Observation 7.1 not accept
either of the objects again, so $x_i$ and $x_j$ cannot be swapped thrice in a cycle.


Corollary 7.10. Each pair of objects can be swapped at most twice in a cycle. 
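
Both Lemma 7.9 and Corollary 7.10 can be observed on the swap sequence from Figure 7.4. The following Python sketch is an illustration only; the helper names `count_pair_swaps` and `partners_before_first_swap` are hypothetical, object x_i is encoded as the integer i, and agent i is assumed to hold x_i initially.

```python
from collections import Counter


def count_pair_swaps(initial, swaps):
    """Count how often each unordered pair of objects is swapped."""
    holds = list(initial)
    counts = Counter()
    for a, b in swaps:
        counts[tuple(sorted((holds[a], holds[b])))] += 1
        holds[a], holds[b] = holds[b], holds[a]
    return counts


def partners_before_first_swap(initial, swaps, i, j):
    """Objects swapped with x_i or x_j before x_i and x_j are swapped."""
    holds = list(initial)
    partners = set()
    for a, b in swaps:
        pair = {holds[a], holds[b]}
        if pair == {i, j}:
            return partners
        if i in pair or j in pair:
            partners |= pair - {i, j}
        holds[a], holds[b] = holds[b], holds[a]
    return None  # x_i and x_j are never swapped in this sequence


initial = [0, 1, 2, 3, 4, 5]
swaps = [(3, 4), (4, 5), (5, 0), (3, 2), (2, 1), (1, 0)]
pair_counts = count_pair_swaps(initial, swaps)
before = partners_before_first_swap(initial, swaps, 3, 0)
print(pair_counts, before)
```

In this sequence only the pair x3, x4 is swapped twice, and exactly x4 and x5 are swapped with x0 or x3 before x0 and x3 are swapped, matching the example discussed above.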


The second interesting implication of Lemma 7.9 is that if $A(z) \in [k, I]$, then
each object $x_i$ with $A(x_i) \in [k, A(z)]$ except for $x$ and $z$ is not swapped with $x$
or $z$ before $x$ and $z$ are swapped. Moreover, no such object can be swapped
with another object that is then swapped with $x$ or $z$. Notice that in this case
agent $I$ holds object $z$ before $x$ and $z$ are swapped and thus $x$ and $z$ are by
Observation 7.1 only swapped once. Hence, an object $x_i$ with $A(x_i) \in [k, A(z)]$
does not have to be swapped at all and thus no two objects have to be swapped
over the edge $\{k, k + 1\}$. We can therefore remove the edge from the cycle
to obtain a path and use the algorithm by Huang and Xiao [HX20]. For the
remainder of this section, we will therefore assume the following.


Assumption 7.11. $A(z) \in [I + 1, k]$.


The third implication of Lemma 7.9 concerns how often an object is swapped
with $x$ or $z$. Note that each object moving clockwise is swapped with $x$ (as all
objects are swapped with $x$ or $z$ between the first and second swap of $x$ and $z$
and an object moving clockwise can never be swapped with $z$). Analogously,
each object moving counter-clockwise is swapped with $z$. Thus, Lemma 7.9
implies the following.


Observation 7.12. Each object $x_i$ with $A(x_i) \in [A(z), k]$ is swapped exactly
twice with $x$ or $z$ and each object $x_i$ with $A(x_i) \notin [A(z), k]$ is swapped exactly
once with $x$ or $z$.


Note that for each edge $e \in \{\{i-1, i\} \mid i \in [k]\}$, there is exactly one object
that is swapped with $x$ over $e$ and this object moves clockwise. Moreover, by
Observation 7.12 each object moving clockwise is swapped with $x$ over one
of these edges. Thus, we can characterize a solution by choosing for each
edge $e \in \{\{i-1, i\} \mid i \in [k]\}$ one object to move clockwise and be swapped
with $x$ over $e$.


7.3.1 Limited Number of Candidates 


In this subsection, we will show that once object $z$ is fixed, there are for each
edge $e \in \{\{i-1, i\} \mid i \in [k]\}$ at most two candidate objects $c_1, c_2$ such that $x$ is
swapped with either $c_1$ or $c_2$ over $e$. We start with a series of helpful lemmata
that will be used often throughout this subsection. The first lemma states that
for each pair $(x_i, x_j)$ of objects and each agent $\ell$, the edge where $x_i$ and $x_j$ are
swapped for the first time is the same in all sequences of swaps where $x_i$ moves
clockwise, $x_j$ moves counter-clockwise, and agent $\ell$ holds both $x_i$ and $x_j$ during
the sequence of swaps.


Lemma 7.13. Let $x_i$ and $x_j$ be two objects and let $\ell \in [A(x_i), A(x_j)]$ be an
agent. There is an edge $e$ such that for each sequence of swaps $\phi$ such that $x_i$
moves clockwise in $\phi$, object $x_j$ moves counter-clockwise in $\phi$, and agent $\ell$ holds
both $x_i$ and $x_j$ during $\phi$, it holds that $x_i$ and $x_j$ are swapped over $e$ during $\phi$.
Deciding whether such an edge exists and computing it if it exists takes $O(n)$
time after an $O(n^2)$-time preprocessing step.


Proof. We distinguish between the two cases $x_j \succ_\ell x_i$ and $x_i \succ_\ell x_j$. Since
both cases are completely analogous, we only show the proof for the former
case. Note that agent $\ell$ must then first hold object $x_i$ before it holds object $x_j$
as it would otherwise not accept object $x_i$ after already holding $x_j$ or an
object it prefers over $x_j$. Thus, objects $x_i$ and $x_j$ must be swapped between
two agents in $[\ell, A(x_j)]$. Now iteratively consider the preference list of an
agent $\ell' \in [\ell, A(x_j)]$ (starting with agent $\ell + 1 \bmod n$). If agent $\ell'$ also prefers $x_j$
over $x_i$, then $x_i$ and $x_j$ cannot be swapped over the edge $\{\ell' - 1, \ell'\}$. Hence,
agent $\ell'$ must also hold object $x_i$ before it holds $x_j$ and we can continue the
argumentation until we either find an agent who prefers $x_i$ over $x_j$ or we
reach agent $A(x_j)$ and $A(x_j)$ also prefers $x_j$ over $x_i$. If we reach agent $A(x_j)$
and $A(x_j)$ also prefers $x_j$ over $x_i$, then all agents in $[\ell, A(x_j)]$ prefer $x_j$ over $x_i$
and hence these two objects cannot be swapped between two such agents. If
agent $\ell'$ prefers $x_i$ over $x_j$, then these two objects can only be swapped over the
edge $\{\ell' - 1, \ell'\}$ as shown next. Assume towards a contradiction that $x_i$ and $x_j$
were swapped over another edge $\{h - 1, h\}$ where $h \in [\ell' + 1, A(x_j)]$. Then,
agent $\ell'$ has to pass object $x_i$ towards agent $h$ before agent $\ell'$ holds object $x_j$.
This means, however, that agent $\ell'$ will not accept object $x_j$ as it prefers $x_i$
over $x_j$. Thus, object $x_j$ cannot be passed to agent $\ell$, a contradiction.

It remains to analyze the running time of the algorithm. We first describe a
simple preprocessing step that eases the computation of deciding which of two
objects an agent prefers. We define $\mathrm{pos}(i, x_j)$ as the position of object $x_j$ in the
preference list of agent $i$. Note that pos can be precomputed once in $O(n^2)$ time
by iterating over the preference list of each agent. Once the preprocessing is done,
we have to (in the worst case) check for each agent $\ell' \in [\ell, A(x_j)]$ whether they
prefer $x_i$ over $x_j$ or not. Since this is only a check whether $\mathrm{pos}(\ell', x_i) < \mathrm{pos}(\ell', x_j)$
or not, the whole procedure takes $O(n)$ time in total after preprocessing. Note
that the preprocessing can be done once and then be reused for each application
of Lemma 7.13.
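
The scan described in the proof of Lemma 7.13 can be sketched in Python as follows. This is a hedged illustration, not the thesis' implementation: the names `build_pos` and `first_edge` are hypothetical, objects missing from a preference list are treated as least preferred (an assumption of this sketch), and the second branch spells out the case that the proof calls analogous. The preference lists are those read from Figure 7.4.

```python
INF = float("inf")


def build_pos(prefs):
    """Preprocessing: pos[a][obj] is the rank of obj in the list of agent a."""
    return [{obj: rank for rank, obj in enumerate(plist)} for plist in prefs]


def first_edge(pos, n, a_i, a_j, x_i, x_j, ell):
    """Edge over which x_i (clockwise, starting at agent a_i) and x_j
    (counter-clockwise, starting at agent a_j) must be swapped if agent ell
    holds both objects; None if no such edge exists."""
    def rank(agent, obj):
        return pos[agent].get(obj, INF)

    if rank(ell, x_j) < rank(ell, x_i):
        # ell holds x_i first: scan clockwise from ell + 1 up to a_j.
        cur = ell
        while cur != a_j:
            cur = (cur + 1) % n
            if rank(cur, x_i) < rank(cur, x_j):
                return ((cur - 1) % n, cur)
        return None
    # ell holds x_j first: scan counter-clockwise from ell - 1 down to a_i.
    cur = ell
    while cur != a_i:
        cur = (cur - 1) % n
        if rank(cur, x_j) < rank(cur, x_i):
            return (cur, (cur + 1) % n)
    return None


prefs = [[5, 4, 3, 0], [3, 4, 5, 1], [5, 1, 4, 2], [2, 4, 3], [5, 3, 4], [0, 3, 5]]
pos = build_pos(prefs)
edge_x3_x4 = first_edge(pos, 6, 3, 4, 3, 4, 3)
edge_x2_x4 = first_edge(pos, 6, 2, 4, 2, 4, 3)
edge_x1_x4 = first_edge(pos, 6, 1, 4, 1, 4, 2)
print(edge_x3_x4, edge_x2_x4, edge_x1_x4)
```

Each call inspects every agent of the relevant interval at most once and performs a constant-time rank comparison per agent, which mirrors the O(n) bound after preprocessing.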


Based on Lemma 7.13, we define the set of edges where two objects $x_i$ and $x_j$
can be swapped.


Definition 7.1. Let $x_i$ and $x_j$ be two objects and let $a, b$ be two agents such
that $a \in [A(x_i), A(x_j)]$ and $b \in [A(x_j), A(x_i)]$. The first edge $\mathrm{fe}_a(x_i, x_j)$ is
the edge computed by Lemma 7.13 for $x_i$, $x_j$, and $a$. If this edge does not
exist, then $\mathrm{fe}_a(x_i, x_j) := \bot$. Let $\{s, s + 1 \bmod n\} := \mathrm{fe}_{A(x_i)}(x_i, x_j)$. The second
edge $\mathrm{se}_b(x_i, x_j)$ is the edge computed by Lemma 7.13 for $x_i$, $x_j$, and $b$ after $x_i$
and $x_j$ have been swapped over $\{s, s + 1 \bmod n\}$, that is, when agent $s$ initially
holds $x_j$ and agent $s + 1 \bmod n$ initially holds object $x_i$. If this edge does not
exist, then $\mathrm{se}_b(x_i, x_j) := \bot$.


The second lemma states that an object can never “overtake” another object 
that moves in the same direction in the cycle. 


Lemma 7.14. Let $x_h$, $x_i$, and $x_j$ be three objects such that $A(x_i) \in [A(x_h), A(x_j)]$.
Let $\phi = (\sigma_0, \sigma_1, \ldots, \sigma_s)$ be a sequence of swaps in which the three objects move in
the same direction. For each $p \in [s]$, it holds that $\sigma_p^{-1}(x_i) \in [\sigma_p^{-1}(x_h), \sigma_p^{-1}(x_j)]$.


Proof. We assume that $x_h$, $x_i$, and $x_j$ are distinct as otherwise the statement
trivially holds. Assume towards a contradiction that there is some
minimal $p \in [s]$ such that $\sigma_p^{-1}(x_i) \notin [\sigma_p^{-1}(x_h), \sigma_p^{-1}(x_j)]$. Since $p$ is minimal,
it holds that $\sigma_{p-1}^{-1}(x_i) \in [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)]$. We now consider the
swap $\{(a, x_c), (b, x_d)\}$ between $\sigma_{p-1}$ and $\sigma_p$. Since $x_c$ and $x_d$ are swapped in $\phi$,
they move in different directions. Let without loss of generality $x_c$ be the object
that moves in the same direction as $x_i$, $x_j$, and $x_h$. If $\{x_i, x_j, x_h\} \cap \{x_c\} = \emptyset$,
then

\[ \sigma_p^{-1}(x_i) = \sigma_{p-1}^{-1}(x_i) \in [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)] = [\sigma_p^{-1}(x_h), \sigma_p^{-1}(x_j)], \]

a contradiction. Hence, $x_c \in \{x_i, x_j, x_h\}$. We distinguish between the two
cases $x_c = x_i$ and $x_c \in \{x_j, x_h\}$. If $x_c = x_i$, then

\[ [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)] = [\sigma_p^{-1}(x_h), \sigma_p^{-1}(x_j)]. \]

Since $x_i$, $x_j$, and $x_h$ are distinct objects, it holds that

\[ \sigma_{p-1}^{-1}(x_i) \in [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)] \setminus \{\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)\}. \]

Thus, since each object can only move one position in each swap, it holds that

\[ \sigma_p^{-1}(x_i) \in [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)] = [\sigma_p^{-1}(x_h), \sigma_p^{-1}(x_j)], \]

which is again a contradiction. Finally, if $x_c \in \{x_j, x_h\}$, then note that $x_d \neq x_i$
as they move in different directions. Hence,

\[ \sigma_p^{-1}(x_i) = \sigma_{p-1}^{-1}(x_i) \in [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)]. \]

Again, since the three objects are distinct, $x_i$ is held by an agent

\[ \sigma_{p-1}^{-1}(x_i) \in [\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)] \setminus \{\sigma_{p-1}^{-1}(x_h), \sigma_{p-1}^{-1}(x_j)\} \]

and since in each step this interval can shrink by at most one, it follows that

\[ \sigma_p^{-1}(x_i) = \sigma_{p-1}^{-1}(x_i) \in [\sigma_p^{-1}(x_h), \sigma_p^{-1}(x_j)], \]

a contradiction.


Finally, the third lemma states that there is a constant distance $c$ such that
for each object $x_i$ which is swapped twice with $x$, the two
edges where $x_i$ is swapped with $x$ have distance $c$ in the input graph.


Lemma 7.15. Let ¢ be a sequence of swaps such that agent I swaps z for x 
in the last swap of ¢. Let x; be an object that moves clockwise in & such 
that A(x;) € JA (x), A(z)]. Then, ser(x1, £) # L A ser(z,x). Let 

{51,51 + 1} := fex(x1, £), {tisti + 1} = sez(x1, £), 

{59,52 + 1} := fez(z,2), and {to, t2 + 1} := sez (z, x). 


Then, it holds that sı — tı = s2 — ta. 


Proof. This lemma almost directly follows from Lemmata 7.9 and 7.14. Assume 
towards a contradiction that there is some x; that moves clockwise and swapped 
twice with x in @ such that sı — tı Æ s2 — ta. Consider the set Y of objects xj 
that move clockwise in @ and such that A(z,) € [A(z), A(z;)] (excluding x; 
and z). By Lemma 7.9, sı — tı = |Y|. Now consider the assignment øy in d after 
the first swap of x; and x and the set Y’ of objects x; that move clockwise in @ 
with o,1(z,) € [o; '(z),o; '(z;)]- By Lemma 7.9 it holds that s3 — ta = |Y” 
and by Lemma 7.14 that Y = Y’. Thus, sı — tı = |Y| = |Y’| = s2 — ta, a 
contradiction.

We continue with the central definition of this subsection: the type of an
object. The type of an object is represented by the index of the edge where the
object can possibly be swapped with $x$ for the first time. The idea behind types
is the following. We will develop a 2-SAT program to determine which objects
move clockwise in a solution. We will show that there are at most two objects
of each type and, roughly speaking, we will introduce a variable for each type
that represents which of the two objects of a type moves clockwise. We will use
Lemma 7.13 to define the type of an object $x_i$. It only remains to find an agent
which holds each of $x_i$ and $x$ at some point in time. If $A(x_i) \in [I, k]$, then
object $x$ has to pass agent $A(x_i)$ and hence we can use this agent. If $A(x_i) \notin [I, k]$,
then $I \in [A(x_i), k]$ and hence agent $I$ has to first hold object $x_i$ before it can
receive object $x$ and hence we can use agent $I$ in Lemma 7.13.


Definition 7.2. The index of an edge $\{t - 1, t\}$ with $t \in [k]$ is $t$. For each
object $x_i$ with $A(x_i) \in [I, k]$, the type of $x_i$ is the index of $\mathrm{fe}_{A(x_i)}(x_i, x)$. For
each object $x_i$ with $A(x_i) \notin [I, k]$, the type of $x_i$ is the index of $\mathrm{fe}_I(x_i, x)$. If
the respective value is $\bot$, then the type of $x_i$ is 0. The candidate set $C_\alpha$ for $\alpha$
contains all objects of type $\alpha$.


Figure 7.6 shows an example of types. We continue by showing that exactly
one object of each type moves clockwise in any solution. We use $t_z$ to denote
the type of $z$. Note that $x$ and $z$ have to be swapped for the first time over the
edge $\{t_z - 1, t_z\}$. By Lemma 7.9, for each edge

\[ \{t_z, t_z + 1\}, \{t_z + 1, t_z + 2\}, \ldots, \{k - 1, k\} \]

one object $x_i$ with $A(x_i) \in [A(z), k]$ has to move clockwise and be swapped
with $x$ over the respective edge. By Definition 7.2, these objects have types

\[ t_z + 1, t_z + 2, \ldots, k \]

and, by Observation 7.12 and Lemma 7.15, these $k - t_z + 1$ objects are swapped
a second time with $x$ over the edges $\{0, 1\}, \{1, 2\}, \ldots, \{k - t_z, k - t_z + 1\}$. Hence
for each edge $\{k - t_z + 1, k - t_z + 2\}, \{k - t_z + 2, k - t_z + 3\}, \ldots, \{t_z - 2, t_z - 1\}$
there is an object that is swapped once with $x$. Since the number of such objects
is $(t_z - 1) - (k - t_z + 1) = 2t_z - k - 2$, there are $(2t_z - k - 2) + (k - t_z + 1) = t_z - 1$
objects that move clockwise in total in each solution where $x$ moves
counter-clockwise and $z$ moves clockwise. By definition of types, these have to have
types $\alpha \in [k - t_z + 1, k]$.

Figure 7.6: An example for types. The objects a, b, c, d, e, and x are placed next
to the agents that initially hold them. The numbers next to edges between I and k
depict the index of the respective edge. Objects c and e can never be swapped with x 
before x reaches agent J. Thus they both have type 0. The type of object b is 2 as 
objects b and x can only be swapped over the edge with index 2 for the first time if x 
moves counter-clockwise. The type of a and d is 1 since if x moves counter-clockwise 
and is swapped with either a or d over a different edge than {0,1}, then agent 1 holds 
the respective object before it holds x. Since agent 1 prefers either object over object x, 
it would not accept x and hence there is no solution in which a or d is swapped with x 
over a different edge than {0, 1}. 
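
The counting argument that precedes Figure 7.6 can be checked numerically. The
following Python sketch is only an illustration: the function name
clockwise_counts is hypothetical, and it assumes, as in the text, that edge
{t − 1, t} has index t and that 2t_z − k − 2 ≥ 0.

```python
from typing import Tuple


def clockwise_counts(k: int, t_z: int) -> Tuple[int, int]:
    """Count clockwise-moving objects, split by how often they are swapped with x.

    Hypothetical helper for illustration; it only mirrors the arithmetic in the text.
    """
    # Swapped twice with x: z itself (edge index t_z) and one object for each
    # edge index t_z + 1, ..., k.
    twice = len(range(t_z, k + 1))
    # Swapped once with x: one object for each edge index k - t_z + 2, ..., t_z - 1.
    once = len(range(k - t_z + 2, t_z))
    return twice, once


twice, once = clockwise_counts(k=10, t_z=7)
print(twice, once, twice + once)
```

For every admissible pair (k, t_z) the two counts add up to t_z − 1, which is
the number of clockwise-moving objects stated above.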


Observation 7.16. Let σ := (σ_0, σ_1, ..., σ_ℓ) be a sequence of swaps which
satisfies Assumptions 7.8 and 7.11 and such that σ_ℓ(I) = x.

For each type α ∈ [k − t_z + 1, k] there is exactly one object x_α of type α that
moves clockwise in σ. All objects whose type is not in [k − t_z + 1, k] move
counter-clockwise in σ. For α ∈ [k − t_z + 1, t_z − 1], it holds
that A(x_α) ∉ [A(z), k]. For α ∈ [t_z, k], it holds that A(x_α) ∈ [A(z), k].


Using Observation 7.16, we can now formalize selections. Selections are an
equivalent way of thinking about REACHABLE OBJECT on cycles. They characterize
which objects move clockwise and which objects move counter-clockwise.


Definition 7.3. Let A ⊆ [k]. A set ι of objects is a selection for A if it contains
exactly one object of each type in A and no other objects. A set ι is a selection
if it is a selection for [k − t_z + 1, k].
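
Definition 7.3 translates directly into a membership test. The sketch below is
an illustration only: the function is_selection and the dictionary type_of
(mapping object names to types) are hypothetical names, and the example types
are the ones described in the caption of Figure 7.6.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def is_selection(chosen: Set[str], type_of: Dict[str, int], types: Iterable[int]) -> bool:
    """Return True iff `chosen` holds exactly one object of each type in `types`
    and no object of any other type (Definition 7.3)."""
    counts = Counter(type_of[obj] for obj in chosen)
    # Every required type occurs, no other type occurs, and each occurs once.
    return set(counts) == set(types) and all(n == 1 for n in counts.values())


# Types as described for Figure 7.6: a and d have type 1, b has type 2,
# c and e have type 0.
type_of = {"a": 1, "d": 1, "b": 2, "c": 0, "e": 0}
print(is_selection({"a", "b"}, type_of, [1, 2]))
print(is_selection({"a", "d", "b"}, type_of, [1, 2]))
```

The first call accepts the set, the second rejects it because two objects of
type 1 are chosen.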


We will show in Subsection 7.3.2 how to test whether a given selection leads
to a solution, that is, a sequence of swaps such that agent I obtains object x.
In the remainder of this subsection, we focus on eliminating possible selections.

Observe that if the type of an object is 0, then it cannot be swapped with x
and hence it has to be moved counter-clockwise. We will slightly abuse the
definition of types and relabel the type of any object x_i to 0 if we know for some
reason that x_i has to move counter-clockwise. Lemma 7.15 states a first rule
that can be used to relabel the type of an object to 0. Hence, we assume that
each object of type α ≠ 0 fulfills the conditions of Lemma 7.15. We conclude
this subsection with a proposition that identifies at most two "relevant" objects
of each type and allows us to relabel the type of all other objects of this type
to 0 (because they cannot be moved clockwise in a solution). To this end, we
define the subtypes of an object. Roughly speaking, the subtype of an object x_i
encodes whether x_i is "closer" (counted in clockwise steps) to z than the other
object that can be considered or whether x_i is "further away". We distinguish
between objects that are possibly swapped once with x and objects that are
possibly swapped twice with x. The main idea is that if x_i is not selected (it
moves counter-clockwise), then it has to be swapped with z. Thus we can use
Lemma 7.13 to compute the edge where x_i and z can be swapped. We can then
check whether another object of the same type α as x_i has to move clockwise in
order for x_i to reach the respective edge where x_i and z can be swapped. We say
that x_i has subtype f if another object of type α between z and x_i has to move
clockwise in order for x_i to reach the specified edge. Otherwise, we say that x_i
has subtype c. This characteristic is captured by the following definition.


Definition 7.4. Let x_i be an object of type α ∈ [k − t_z + 1, t_z − 1], let x_j be an
object of type β ∈ [t_z + 1, k], and let t_z be the type of z.
If A(x_i) ∈ [I, A(z) − 1], then let e := fe_I(z, x_i) and if A(x_i) ∈ [k, n − 1],
then let e := fe_{A(x_i)}(z, x_i). If {a − 1, a} := e ≠ ⊥, then
h := |[a, A(x_i)]| − 1 is the distance between x_i and z. If α > h, then the
subtype of x_i is c (for closer) and if α ≤ h, then the subtype of x_i is f (for
further).

Let e_1 := fe_{A(x_j)}(z, x_j) and e_2 := se_I(z, x_j). If {b − 1, b} := e_1 ≠ ⊥
and {c − 1, c} := e_2 ≠ ⊥, then h := |[b, A(x_j)]| − 1 is the distance
between x_j and z and let h' := |[c, b − 1]| − 1. If β > t_z + h
and h' = t_z − 2, then the subtype of x_j is c and if β ≤ t_z + h
and h' = t_z − 2, then the subtype of x_j is f.


See Figure 7.7 for an illustration of subtypes. If e = ⊥, then x_i cannot move
counter-clockwise and hence we can relabel the type of all other objects of type α
to 0. Analogously, if {e_1, e_2} ∩ {⊥} ≠ ∅, then x_j cannot move counter-clockwise
and hence we relabel the type of all other objects of type β to 0.

Figure 7.7: An example that illustrates the main idea behind Definition 7.4 and
Proposition 7.18. Some objects are depicted next to the agents that initially hold them. Let
the type of y be β = 6 and assume that object y moves counter-clockwise. It therefore
has to be swapped with z at some point and since z has to pass agent A(y) to reach
agent I, we can use Lemma 7.13 to compute the edge e := fe_{A(y)}(z, y) where y and z
swap for the first time. Let e = {12, 13} (the red edge). The distance h computed
in Definition 7.4 is then h := |[13, A(y)]| − 1 = |[13, 17]| − 1 = 4 and describes the
number of edges that object y has to pass before it can be swapped with z. Note that
for each type α > t_z, there is an object of type α that is swapped with x twice (first
in the violet (bottom right) region between agents t_z − 1 and k and, by Lemma 7.15,
a second time in the orange region (top right) between agents I and k − t_z + 1). All
of these objects have to be swapped with y and, by Lemma 7.14, this has to be in
the yellow (left) region as z is swapped with y over the red edge and all other objects
have to be swapped with y on consecutive edges. Similarly, the object that is swapped
with y over the edge {15, 16} is swapped with x over the edge {3, 4}, the object that
is swapped with y over the edge {16, 17} is swapped with x over the edge {4, 5}, and
so on. Hence, the object that is swapped with y over the edge {16, 17} (the first edge
of y in counter-clockwise direction) is swapped with x over the edge {h, h + 1} = {4, 5}
(green) and is therefore of type h + 1 = 5. Since 6 > h, the subtype of y is f and y is
not swapped with an object of type 2 that is initially held by an agent in [A(z), A(y)].

Before we show the main proposition of this section, we prove a lemma that
characterizes the types of all objects an object moving counter-clockwise is
swapped with.


Lemma 7.17. Let x_i be an object of type α and let h be the distance between x_i
and z. If x_i moves counter-clockwise, then


• if h ≤ k − t_z, then x_i is swapped with an object of each type β ∈ [t_z, t_z + h]
before x_i is swapped with z for the first time, and


• if h > k − t_z, then for each type β ∈ [t_z, k] ∪ [k − t_z + 1, h] it holds that x_i
is swapped with an object of type β before x_i is swapped with z for the first
time.


Proof. Let x_i be an object of type α that moves counter-clockwise. We consider
the two cases A(x_i) ∈ [I, A(z) − 1] and A(x_i) ∈ [A(z), n − 1].

If A(x_i) ∈ [I, A(z) − 1], then note that agent I has to hold object x_i before
it can obtain z. Hence, by Lemma 7.13, objects x_i and z are swapped over the
edge fe_I(z, x_i). If A(x_i) ∈ [A(z), n − 1], then note that agent A(x_i) has to hold
object z before agent I can hold object z. Hence, by Lemma 7.13, objects x_i
and z are swapped over the edge fe_{A(x_i)}(z, x_i).

If the respective edge where x_i and z can swap for the first time exists, then
we denote it by {a − 1, a}. Note that the distance h between x_i and z exactly
describes the number of edges in the path between A(x_i) and a that x_i has to
pass before it can be swapped with z. Note that each object with which x_i is
swapped moves clockwise. By Lemmata 7.9 and 7.15, the object that is swapped
with x_i over the edge {a, a + 1} is swapped with x over the edge {t_z, t_z + 1} and
it therefore has type t_z + 1. Repeating this argument, the object that is swapped
with x_i over the edge {a + 1, a + 2} is of type t_z + 2 and so on until type k is
reached (after k − t_z iterations). Thus, if h ≤ k − t_z, then x_i is swapped with
an object of each type β ∈ [t_z, t_z + h]. If h > k − t_z, then x_i is swapped with
an object of each type β ∈ [t_z, k] in the first k − t_z iterations. The next type
after k has then to be k − t_z + 1 as the object of type k is swapped with x
also over the edge {k − t_z − 2, k − t_z − 1}. Thus, if h > k − t_z, then x_i is also
swapped with an object of each type β ∈ [k − t_z + 1, h].


We conclude with the main result of this subsection. This allows us to relabel
the type of all except for two objects of some type α ≠ 0 to 0. If two objects of
type α ≠ 0 remain afterwards, then they have different subtypes.

Proposition 7.18. Given objects x and z, there is an O(n²)-time preprocessing
that excludes all but at most two objects of each type α ≥ k − t_z + 2 as potential
candidates for being swapped with x. Afterwards |C_α| ≤ 2 for
all α ∈ [k − t_z + 1, k].


Proof. Consider a type α ≥ k − t_z + 2 and all objects of type α. Compute the
subtype of each of these objects. Exactly one of them is moved clockwise and all
others have to be swapped with z at some point. Let x_i be an object of type α.
We now consider the two cases α ∈ [k − t_z + 1, t_z − 1] and α ∈ [t_z + 1, k]. We
start with the case where α ∈ [k − t_z + 1, t_z − 1]. Note that in this case
if A(x_i) ∈ [A(z), k] and x_i moves clockwise, then, by Lemma 7.9, x_i is swapped
with x before x and z are swapped. Hence, x_i cannot be swapped with x for the
first time over the edge {α − 1, α}. Thus, x_i cannot move clockwise and its type
can be relabeled to 0. We therefore assume that A(x_i) ∉ [A(z), k]. We consider
the two cases A(x_i) ∈ [I, A(z) − 1] and A(x_i) ∈ [k, n − 1].
If A(x_i) ∈ [I, A(z) − 1] and x_i moves counter-clockwise, then note that agent I
has to hold object x_i before it can obtain x. Hence, by Lemma 7.13, objects x_i
and z are swapped over the edge fe_I(z, x_i). If A(x_i) ∈ [k, n − 1], then note
that agent A(x_i) has to hold object z before agent I can hold object z. Hence,
by Lemma 7.13, objects x_i and z are swapped over the edge fe_{A(x_i)}(z, x_i).
Note that if the respective edge does not exist (the respective value is ⊥),
then x_i cannot move counter-clockwise and thus all other objects of type α have
to move counter-clockwise and we can therefore relabel their type to 0. Note
further that there is no solution if such an edge does not exist for multiple
objects of the same type.

If α ∈ [t_z + 1, k] and x_i moves clockwise, then note that A(x_i) ∈ [A(z), k]
or object x_i cannot be swapped twice with x before x and z are swapped
twice. Hence, A(x_i) holds z during a solution and we can use Lemma 7.13 to
compute the edge e := fe_{A(x_i)}(z, x_i) over which x_i and z are swapped for
the first time. Again, if the respective edge does not exist (the respective
value is ⊥), then x_i cannot move counter-clockwise and thus all other objects
of type α have to move counter-clockwise and we can therefore relabel their type
to 0. Moreover, x_i and z have to be swapped a second time over the
edge e' := se_I(z, x_i). If {b − 1, b} := e ≠ ⊥ and {c − 1, c} := e' ≠ ⊥, then
let h' := |[c, b − 1]| − 1. Note that if h' ≠ t_z − 1, then x_i and z cannot be
swapped over {b − 1, b} and {c − 1, c} as there are, by Observation 7.16,
exactly t_z − 1 objects that move clockwise in any solution. Hence, in this
case x_i has to move clockwise and we can relabel the type of all other objects
to 0.

By Lemma 7.17, object x_i is swapped with an object of type α if and only
if h ≥ α, that is, if the subtype of x_i is f.

We now show that there are at most two candidates for each type α. To this
end, we iterate over all agents, starting with A(z) and iterating clockwise. If the
object that is initially held by the agent is of type α, then we add its subtype
to the end of an initially empty sequence. We then distinguish whether the
sequence is sorted, that is, it is (c, c, ..., c, f, f, ..., f), or not. If the sequence
is not sorted, then for each object x_i of type α, there exists an object of type α
and subtype f that starts in [A(z), A(x_i)] or an object of type α and subtype c
that starts in [A(x_i), A(z)]. Thus, if x_i moves clockwise, then the number of
counter-clockwise steps of some other object of type α does not match the
number of swaps needed to reach the edge where the objects can be swapped
with z. Hence, no object of type α can be moved clockwise and therefore there
is no solution.

Now consider the case where the objects are sorted by their subtype. By
the same argument as above there are only two objects of type α that
can possibly be moved clockwise: The "last" object of subtype c and the "first"
object of subtype f. We can therefore set the type of all other objects of type α
to 0.

It remains to analyze the running time. Let n_α be the number of objects of
type α. Since the subtype for each object of type α can be computed in O(n)
time, we obtain that the described preprocessing takes O(n_α · n) time for type α.
After having computed the subtype of each object of type α, we iterate over
all these objects and find in O(n) time the two specified objects or determine
that the objects are not ordered by their subtype. Hence, the overall running
time is in O(Σ_α (n_α · n)). Note that each object (except for x) has exactly
one type and hence Σ_α n_α ≤ n. Thus, the overall running time is bounded
by O(Σ_α (n_α · n)) ⊆ O(n²).
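
The filtering step in the proof of Proposition 7.18 (check that the subtype
sequence is sorted, then keep the last c and the first f) can be sketched as
follows. This is an illustration under the assumption that the subtypes of all
objects of one type α are already given in clockwise order starting at A(z);
the function name relevant_candidates is hypothetical.

```python
from typing import List, Optional


def relevant_candidates(subtypes: List[str]) -> Optional[List[int]]:
    """Given the subtypes ('c' or 'f') of all objects of one type in clockwise
    order, return the positions of the at most two objects that may still move
    clockwise, or None if the sequence is not of the form c...c f...f."""
    seq = "".join(subtypes)
    if "fc" in seq:
        # An f before a c: the sequence is not sorted, so there is no solution.
        return None
    num_c = seq.count("c")
    keep = []
    if num_c > 0:
        keep.append(num_c - 1)  # the "last" object of subtype c
    if num_c < len(seq):
        keep.append(num_c)      # the "first" object of subtype f
    return keep


print(relevant_candidates(["c", "c", "f", "f"]))
print(relevant_candidates(["c", "f", "c"]))
```

All objects at positions that are not returned would have their type relabeled
to 0; a single linear scan per type suffices for this step.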


Proposition 7.18 shows that there are at most two objects of each type. In
the following, we will partition types into blocks where we will observe that
for each block there are at most two possible choices of which objects of the
respective types to move clockwise. These choices will then be used to develop
a 2-SAT program. Note that the proof of Proposition 7.18 also states that if
there are two objects of some type α ≠ 0, then one of them has subtype c and
one has subtype f. Hence, we can uniquely identify any object x_i which does
not have type 0 by its type-subtype combination. For the sake of readability, we
will denote the unique object of type α and subtype c by α_c. Analogously, β_f is
the unique object of type β and subtype f. If an object x_i is the only object of
some type α ≠ 0, then we say that α_c = α_f = x_i.

7.3.2 Compatibility of Solutions


So far, we have shown how to compute a set of at most two candidates to be 
swapped with x over each edge which x has to pass. Recall that the main 
idea of our algorithm for REACHABLE OBJECT on cycles is as follows. We first 
partition the edges which x has to pass such that 


1. for each part of the partition there are at most two possible choices for 
selecting a candidate for all edges in the respective part and 


2. candidates for two different parts are either incompatible, that is, there is 
no solution for the overall problem that uses the respective candidates, or 
they can be combined independently of the choices of candidates for other 
parts. 


We then develop a 2-SAT program with a variable for each part of the described 
partition and use it to compute a set of pairwise compatible candidates for each 
edge. We next show how to partition types (these represent all edges that x 
has to pass) such that there are only two possible selections for all types in one 
part of the partition (we will call those parts blocks). Afterwards, we prove that 
selections for different blocks can be picked almost independently such that a 
set of pairwise compatible selections for each block can be computed by a 2-SAT 
program. 
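
To make the last step concrete, the following Python sketch shows one standard
way to solve a 2-SAT instance (implication graph plus strongly connected
components, here via Kosaraju's algorithm). It is not the construction from
this chapter: the encoding of blocks as variables in the example (True for one
candidate selection of a block, False for the other) and all names are
assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def solve_2sat(num_vars: int, clauses: List[Tuple[int, int]]) -> Optional[List[bool]]:
    """Literal +i / -i means variable i (1-based) is true / false.
    Returns a satisfying assignment or None if the formula is unsatisfiable."""
    def node(lit: int) -> int:
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    adj: List[List[int]] = [[] for _ in range(n)]
    radj: List[List[int]] = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for u, v in ((node(-a), node(b)), (node(-b), node(a))):
            adj[u].append(v)
            radj[v].append(u)

    # Pass 1: iterative DFS on the implication graph, recording finish order.
    order: List[int] = []
    seen = [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(adj[u]):
                stack.append((u, i + 1))
                v = adj[u][i]
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)

    # Pass 2: DFS on the reversed graph in decreasing finish time; components
    # are numbered in topological order of the implication graph.
    comp = [-1] * n
    c = 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack2 = [s]
        while stack2:
            u = stack2.pop()
            for v in radj[u]:
                if comp[v] == -1:
                    comp[v] = c
                    stack2.append(v)
        c += 1

    result = []
    for i in range(num_vars):
        if comp[2 * i] == comp[2 * i + 1]:
            return None  # a variable and its negation are equivalent
        result.append(comp[2 * i] > comp[2 * i + 1])
    return result


# Hypothetical instance with three blocks: the first selections of blocks 1 and 2
# are incompatible, blocks 2 and 3 cannot both take their second selection, and
# the first selection of block 3 is ruled out.
print(solve_2sat(3, [(-1, -2), (2, 3), (-3, -3)]))
```

Each clause with two negative literals forbids one incompatible pair of
choices, which matches the intuition that candidates for different parts are
either incompatible or can be combined independently of all other parts.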

Before we provide the formal definition of blocks, we first focus on objects
of type 0 and show that no object of type 0 can initially be held by an agent
"between" the two agents that initially hold the two objects of some type α ≠ 0.
Note that Proposition 7.18 states that there are at most two objects of type α
(one of subtype c and one of subtype f).


Lemma 7.19. For each object x_h of type 0 and each type α ≠ 0, if there are
two objects x_i and x_j of type α ≠ 0, then both of them are initially held by
agents in [A(x_h), A(z)], both of them initially start in [A(z), A(x_h)], or the
type of one of these two objects can be relabeled to 0.


Proof. Assume that there is an object x_h of type 0 such that there are two
objects x_i and x_j such that A(x_i) ∈ [A(x_h), A(z)], A(x_j) ∈ [A(z), A(x_h)], and
both x_i and x_j have type α ≠ 0. Let d be the distance between A(x_h) and A(z).
By Lemma 7.17, we can compute whether x; swaps with an object of type a 
before it is swapped with z. If so, then x; cannot move clockwise and hence its 
type can be relabeled to 0. If not, then x; cannot move clockwise and hence its 
type can be relabeled to 0.

Figure 7.8: An example of blocks. Only a subpath of the input cycle with the objects
initially held by the agents is depicted. The object 0 represents an object of type 0.
The blocks in this example are {1}, {2, 3, 4}, and {5}. The boxes indicate all objects
of types corresponding to each block. Note that 5_c and 5_f are adjacent and hence {5}
is a block as blocks are minimal. Note that {2, 3} is not a block as 4_c is initially held
by an agent in [A(2_c), A(3_f)]. Since z is the only object of type 1, it always holds
that {1} is a block.


We assume that Lemma 7.19 has been exhaustively applied to relabel the
type of objects to 0. This will help us to define blocks. Intuitively, blocks are
sets of consecutive types α, α + 1, ..., β such that all objects of those types start
on a (connected) subpath of the input graph.


Definition 7.5. A block is a minimal subset B ⊆ [k − t_z + 1, k] of types such
that there are two agents a and b and all objects whose type is in B are initially
held by agents in [a, b] and all objects that are initially held by agents in [a, b]
have a type in B.


Figure 7.8 depicts an example of blocks. Based on blocks we can state a new 
rule to relabel the type of an object to 0. 


Lemma 7.20. Let A = [α, β] be a block and let γ ∈ [α, β] be a type. If for
some δ ∈ [γ + 1, β] it holds that A(δ_c) ∈ [A(z), A(γ_c)], then there is no solution
in which δ_c moves clockwise. If A(ε_f) ∈ [A(z), A(γ_f)] for some ε ∈ [α, γ − 1],
then there is no solution in which γ_f moves clockwise.


Proof. First, assume towards a contradiction that A(δ_c) ∈ [A(z), A(γ_c)] and
that there is a solution in which δ_c moves clockwise. Note that by
Proposition 7.18 and the definition of subtypes, it holds that A(η_c) ∈ [A(z), A(η_f)]
for each η ≠ 0. Hence, there is no object of type γ that is initially held by
an agent in [A(z), A(δ_c)]. Consider the solution where δ_c moves clockwise up
to the assignment σ_r after the swap of δ_c and x over the edge {δ − 1, δ}. By
Lemma 7.9, no object of type γ is held by an agent in [σ_r^{-1}(z), σ_r^{-1}(x)].
Thus, x cannot be swapped over the edge {γ − 1, γ}, a contradiction to the
assumption that we considered a solution.

Second, assume towards a contradiction that A(ε_f) ∈ [A(z), A(γ_f)] and
there is a solution in which γ_f moves clockwise. Since A(ε_c) ∈ [A(z), A(ε_f)],
all objects of type ε are initially held by agents in [A(z), A(γ_f)]. Consider
the solution where γ_f moves clockwise and the assignment after γ_f and x are
swapped over the edge {γ − 1, γ}. By Lemma 7.9, object x was not swapped
with any object of type ε before it was swapped with γ_f. Hence, x cannot have
passed the edge {ε − 1, ε}, a contradiction to Assumption 7.8.


We henceforth assume that the type of objects satisfying Lemma 7.20 is 
relabeled to 0. We will show that blocks are the partitions we are looking 
for, that is, for each block A there are only two possible selections for A and 
selections for different blocks can be chosen almost independently. We start 
with a lemma that states that blocks define a partition of types. 


Lemma 7.21. Each type η ≥ k − t_z + 2 is contained in exactly one block and
all blocks can be computed in linear time.


Proof. We first show that A(z) and A(x) divide the types into two intervals.
Observe that all objects of type α ∈ [t_z + 1, k] have to start in [A(z), A(x)] and
all objects of type α ∈ [k − t_z + 2, t_z − 1] have to be initially held by an agent
in [A(x), A(z)]. Since z is the only object of type t_z, the interval [t_z, t_z] is a
block and no other block can contain the type t_z. We now show that blocks are a
partition of types that can be computed in linear time. We first focus on all other
types starting with types in [t_z + 1, k]. Consider the object (t_z + 1)_c. This has to
be the initially "closest" non-type-0 object to agent A(z).
If (t_z + 1)_c = (t_z + 1)_f, then [t_z + 1, t_z + 1] is a block. Otherwise, we know by
Lemma 7.20 that the next object (in clockwise steps) has to be
either (t_z + 2)_c or (t_z + 1)_f. If the object ℓ_f is found, where ℓ is the largest
type that is so far considered in the block, then the block [t_z + 1, ℓ] is found.
Notice that ℓ ≤ k and if a block is found, then we can redo the whole process
starting with the object (ℓ + 1)_c until the block [ℓ', k] is found for some ℓ'.
Starting then from agent k, we can search for object (k − t_z + 2)_c and repeat
the whole argumentation until a block [ℓ', t_z − 1] is found. At this point, each
type is contained in exactly one block. Since we need only a constant amount of
computation time for each object, all blocks can be computed in linear time.
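
The linear scan in this proof can be sketched on a simplified input. The Python
code below is an illustration, not the procedure of the thesis: it assumes that
the types of the initially held objects are given as a list in clockwise order
along one subpath, that type-0 objects never lie between two objects of the
same type (Lemma 7.19), and it does not check that the types of a block are
consecutive. The name compute_blocks is hypothetical.

```python
from typing import Dict, List


def compute_blocks(types_in_order: List[int]) -> List[List[int]]:
    """Split a clockwise sequence of object types into minimal groups of types
    whose objects occupy a contiguous stretch of agents."""
    # Position of the last object of each nonzero type.
    last: Dict[int, int] = {}
    for idx, t in enumerate(types_in_order):
        if t != 0:
            last[t] = idx

    blocks: List[List[int]] = []
    current = set()
    end = -1
    for idx, t in enumerate(types_in_order):
        if t == 0:
            continue  # type-0 objects only occur between blocks by assumption
        current.add(t)
        end = max(end, last[t])
        if idx == end:
            # Every type opened so far has been closed: a minimal block ends here.
            blocks.append(sorted(current))
            current = set()
            end = -1
    return blocks


# A sequence consistent with the blocks named in Figure 7.8 (the concrete order
# of objects is assumed here).
print(compute_blocks([1, 2, 3, 4, 2, 3, 4, 0, 5, 5]))
```

Each object is inspected a constant number of times, in line with the
linear-time bound of Lemma 7.21.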


In order to prove that there are only two possible selections for each block 
that can lead to a solution, we first show an intermediate lemma. 


Lemma 7.22. Let A = [α, β] be a block and let ι_A be a selection for A.
Let γ ∈ A be a type. If γ_f ∈ ι_A, then δ_f ∈ ι_A for each δ ∈ [γ, β] or ι_A cannot
be part of a selection that corresponds to a solution in which x reaches I.

Proof. We prove the statement by induction on γ. Note that if γ = β, then the
statement trivially holds. Now assume that γ < β, γ_f ∈ ι_A, and
if (γ + 1)_f ∈ ι_A, then δ_f ∈ ι_A for each δ ∈ [γ + 1, β]. By Lemma 7.20, it holds
that (γ + 1)_c is initially held by an agent in [A(α_c), A(γ_f)] and (γ + 1)_c cannot
move clockwise since γ_f moves clockwise and is swapped with x over the
edge {γ − 1, γ}. This holds true as if (γ + 1)_c moved clockwise, then it holds by
Lemma 7.9 that x is swapped with γ_f before it is swapped with (γ + 1)_c.
Hence, x cannot be swapped with (γ + 1)_c over the edge {γ, γ + 1}, whose index
is the type of (γ + 1)_c. Thus, object (γ + 1)_c moves counter-clockwise
and (γ + 1)_f moves clockwise. By induction hypothesis, δ_f ∈ ι_A for
each δ ∈ [γ, β].


Based on Lemmata 7.20 and 7.22, we can now prove that there are only two 
possible selections for each block that can lead to a solution. 


Lemma 7.23. Let A = [α, β] be a block. There are at most two selections
for A that can be part of a selection that corresponds to a solution in which x
reaches I. These selections can be computed in O(n · |A|) time.


Proof. We will construct two selections ι_1 and ι_2 for A that can be part of
a selection that corresponds to a solution in which x reaches I. We start
with α_c ∈ ι_1 and α_f ∈ ι_2. Note that by Lemma 7.22,
ι_2 = {α_f, (α + 1)_f, ..., β_f}. Thus it remains to show that ι_1 is unique.

If α_c moves clockwise, then α_f has to move counter-clockwise. Using
Lemma 7.17, we can compute the number h of objects x_i that are initially held
by A(x_i) ∈ [A(α_c), A(α_f)], that have types α + 1, α + 2, ..., α + h, and that have
to move clockwise. We now switch to an arbitrary type γ as we will use the
statement iteratively (starting with γ = α). If γ = β, then
ι_1 = {α_c, (α + 1)_c, ..., β_c}. We therefore assume that γ < β, that γ_c moves
clockwise, and that h objects x_i initially held by A(x_i) ∈ [A(γ_c), A(γ_f)] and of
types γ + 1, γ + 2, ..., γ + h move clockwise. We consider the two cases h = 0
and h > 0. If h = 0, then note that, by Lemma 7.20,
A((γ + 1)_c) ∈ [A(γ_c), A(γ_f)]. Hence, (γ + 1)_c moves counter-clockwise
and (γ + 1)_f moves clockwise. By Lemma 7.22, it holds for
each δ ∈ [γ + 1, β] that δ_f ∈ ι_1 and thus

ι_1 = {α_c, (α + 1)_c, ..., γ_c, (γ + 1)_f, (γ + 2)_f, ..., β_f}.

If h > 0, then we will show that (γ + 1)_c moves clockwise and hence we can
repeat the argument. Assume towards a contradiction that (γ + 1)_c moves
counter-clockwise. Then (γ + 1)_f moves clockwise and, by Lemma 7.22, so does δ_f
for each δ ∈ [γ + 1, β]. Hence, no object δ_c moves clockwise for δ ∈ [γ + 1, β].
Note that by definition of blocks it holds for each object x_i which is initially held
by A(x_i) ∈ [A(γ_c), A(γ_f)] that x_i = δ_c for some δ ∈ [γ, β] or that x_i = η_f for
some η ∈ [α, γ]. If η_f moved clockwise for some η ∈ [α, γ], then, by Lemma 7.22,
object γ_f also moved clockwise, a contradiction. Thus, if (γ + 1)_c moves
counter-clockwise, then no object x_i initially held by A(x_i) ∈ [A(γ_c), A(γ_f)] moves
clockwise. Thus, γ_c and γ_f cannot be swapped and x cannot reach I.

It remains to analyze the running time. Computing ι_2 takes O(n) time.
Computing ι_1 takes O(n) time for each γ_c with γ ∈ [α, β], that is, O(n · |A|) time
in total.
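
The shape of the selections appearing in this proof can be made explicit. By
Lemma 7.22, a selection for a block [α, β] that can be extended to a solution
chooses subtype c for a prefix of the types and subtype f for the remaining
suffix. The following Python sketch merely enumerates these shapes and tests
the closure property; it does not perform the computation of h that singles out
ι_1 and ι_2, and the function names are hypothetical.

```python
from typing import List, Tuple

Selection = List[Tuple[int, str]]  # pairs (type, subtype)


def threshold_selections(alpha: int, beta: int) -> List[Selection]:
    """All selections for the block [alpha, beta] of the form
    alpha_c, ..., gamma_c, (gamma+1)_f, ..., beta_f."""
    out = []
    for last_c in range(alpha - 1, beta + 1):  # alpha - 1 means "no c at all"
        out.append([(t, "c" if t <= last_c else "f") for t in range(alpha, beta + 1)])
    return out


def respects_lemma_7_22(selection: Selection) -> bool:
    """True iff every type after the first chosen f also chooses f."""
    seen_f = False
    for _, subtype in sorted(selection):
        if subtype == "f":
            seen_f = True
        elif seen_f:
            return False
    return True


for sel in threshold_selections(2, 4):
    print(sel, respects_lemma_7_22(sel))
```

Among these |A| + 1 shapes, the proof above shows that at most two (the
all-f selection ι_2 and one selection ι_1 containing α_c) remain possible.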


It is finally time to explain how to check whether a selection leads to a
solution, that is, a sequence of swaps such that agent I obtains object x. Note
that once a selection ι is fixed, Observation 7.12 states which objects are
swapped how often with x or z. We assume that no object moves after it is
swapped with x or z for the final time as these swaps are not necessary for x
reaching I. Thus, we can compute the final position of each object and also
the path P_{x_i} of agents that hold each object x_i during a solution corresponding
to ι. Gourvès et al. [GLW17] observed that once the path P_{x_i} of each object x_i
is fixed, then the order in which objects are swapped is irrelevant as long as
all objects "follow" their respective paths. Thus, there is a unique set of edges
where two objects x_i and x_j swap in each solution in which the objects in ι
move clockwise and all other objects move counter-clockwise. We denote this
set by e^ι_{x_i,x_j}. An example of such a set and such a path is given in
Figure 7.9. It remains to show how to compute e^ι_{x_i,x_j} and how to find a
selection where each pair of objects can be swapped at the respective edge. To
this end, we show how to compute e^ι_{x_i,x_j} from partial selections, that is,
from selections for some subset A of types.


Lemma 7.24. Let x_i be an object of type α ∈ [k − t_z + 1, k], let x_j be an
object of type β ∈ [k − t_z + 1, k], and let x_h be an object of type 0. Let A
and B be two blocks with α ∈ A and β ∈ B. Given a selection ι_A for A, the
set e^ι_{x_i,x_h} is the same for each selection ι ⊇ ι_A and can be computed in O(n)
time. Given selections ι_A for A and ι_B for B, the set e^ι_{x_i,x_j} is the same for
each selection ι ⊇ ι_A ∪ ι_B and can be computed in O(n) time.


Proof. We start with determining which pairs of objects are not swapped, which
pairs are swapped once and which are swapped twice. Let x_p be an object
moving clockwise and let x_q be an object moving counter-clockwise. We consider



Figure 7.9: An example of REACHABLE OBJECT on cycles. The objects initially held by
each agent are depicted outside each vertex. If objects z and c move clockwise
(ι = {c, z} is the selection) and all other objects move counter-clockwise, then
objects a and z swap over the edge {4, 5} as z swaps with objects x and d over the
edges {2, 3} and {3, 4}, respectively, and object a swaps with c over the edge {0, 5}.
The path P_c for this solution is (4, 5, 0, 1, 2) as c is initially held by agent 4,
moves clockwise, and is swapped with x over the edge {1, 2}.


the two cases A(x_p) ∈ [A(z), k] and A(x_p) ∉ [A(z), k]. If A(x_p) ∈ [A(z), k]
and A(x_q) ∈ [A(x_p), k], then x_p and x_q are, by Lemma 7.9, swapped once
before x and z are swapped for the first time and once afterwards. Thus,
by Corollary 7.10, they are swapped exactly twice. If A(x_p) ∈ [A(z), k]
and A(x_q) ∉ [A(x_p), k], then x_p and x_q are swapped once as z and x_q are,
by Observation 7.12, swapped exactly once and we assume that no object moves
after it is swapped with x or z for the final time.
If A(x_p) ∉ [A(z), k], then we distinguish between the three cases


A(x_q) ∈ [A(z), k], A(x_q) ∈ [k + 1, A(x_p)], and A(x_q) ∈ [A(x_p), A(z)].


In the first case, x_q is swapped twice with z and since, by Lemma 7.9, it is not
swapped with x_p before it is swapped with z for the first time, it is swapped
with x_p once. In the second case, x_q is swapped once with z and since, by
Lemma 7.9, it is not swapped with x_p before it is swapped with z for the first
time, it is not swapped with x_p. In the third case, x_q is swapped once with z
and, by Lemma 7.9, it is swapped with x_p before it is swapped with z. Thus, in
this case x_p and x_q are swapped once.

We now show how to compute the set of edges where two objects can be
swapped. Note that it is enough to compute the first edge where two objects can
be swapped as if two objects x_p and x_q are swapped twice, then the object moving
counter-clockwise is, by Lemma 7.9, swapped with all objects moving clockwise
before x_p and x_q are swapped again and there are always t_z objects moving
clockwise. By definition of blocks and by Lemma 7.20, each object x_a with a type
in A = [γ, δ] is initially held by agent A(x_a) ∈ [A(γ_c), A(δ_f)]. By Lemma 7.19,
it holds for each type μ ∈ [k − t_z + 1, k] that A(x_h) ∉ [A(μ_c), A(μ_f)] and thus
there is some type μ' such that A(x_h) ∈ [A(μ'_f), A((μ' + 1)_c)]. We now compute
the first edge where x_i and x_h can be swapped if x_i moves clockwise. Note
that if x_i moves counter-clockwise, then it is not swapped with x_h. If α ≤ μ',
then x_h is swapped once with an object of each type in [α, μ'] \ {α} before it is
swapped with x_i. Hence {A(x_h) − |α − μ'|, A(x_h) − |α − μ'| + 1} ∈ e^ι_{x_i,x_h}
is the first edge where x_i and x_h can be swapped. If α > μ', then x_h is first
swapped once with an object of each type μ', μ' − 1, ..., k − t_z + 1, k, k − 1, ..., α + 1
before it is swapped with x_i. Note that these are t_z − |{i | μ' < i ≤ α}| objects
and therefore in this case
{A(x_h) − t_z + α − μ' − 1, A(x_h) − t_z + α − μ'} ∈ e^ι_{x_i,x_h}
is the first edge where x_i and x_h can be swapped.

It remains to analyze the possible edges for x_i and x_j. We assume without
loss of generality that x_i moves clockwise and x_j moves counter-clockwise,
that is, x_i ∈ ι_A and x_j ∉ ι_B. We distinguish between the two cases A = B
and A ≠ B. If A = B, then ι_A = ι_B and the direction of each object initially
held by agents in [A(γ_c), A(δ_f)] is known. Since the number of objects moving
clockwise is constant (and equal to t_z), the number c of objects moving clockwise
in [A(x_i), A(x_j)] is known and the first edge where x_i and x_j can be swapped
is {A(x_j) − c − 1, A(x_j) − c}. If A ≠ B, then let B := [ψ, χ]. Since ι_B is given,
the number of objects in [A(ψ_c), A(x_j)] moving clockwise is known. We can then
use the same argument as for x_h, where μ' = ψ − 1 (or μ' = k if ψ = k − t_z + 1).
Note that computing the unique set e^ι_{x_i,x_j} (or e^ι_{x_i,x_h}) takes O(n) time
as we only compute the type of certain objects and the number of objects of
certain types moving clockwise.


An example of the set of edges computed in Lemma 7.24 is given in Figure 7.10.
A selection ι leads to a solution if and only if for each pair (x_i, x_j) of objects
such that x_i ∈ ι and x_j ∉ ι and each edge e ∈ e^ι_{x_i,x_j}, the agents incident to e
can agree on swapping x_i and x_j. Hence, to check for a given selection ι whether it
leads to a solution, we iterate over all pairs (x_i, x_j) of objects such that x_i ∈ ι
and x_j ∉ ι and distinguish between the following three cases.


• Either x_j does not have a type in [k - t_x + 1, k], that is, x_j = x or the
  type of x_j is 0,
• the types of x_i and x_j are in the same block, or
• the types of x_i and x_j are in different blocks.


Figure 7.10: An example to illustrate Lemma 7.24. Depicted is a set of agents that are
arranged in a cycle. Some objects are depicted next to the vertices of the agents which
initially hold them. Notice that A = [2,3,4] is a block and let ι_A := {2_c, 3_f, 4_f} be a
selection for A. We show how to compute the edges e^ι_{2_c,4_c} for each selection ι ⊇ ι_A.
Since objects 3_c and 2_f move counter-clockwise (they are not contained in ι_A), the two
objects 2_c and 4_c next to them have to be swapped over edge {5,6}. Afterwards, by
Lemma 7.9, object 4_c is swapped with each other object moving clockwise before it is
swapped with 2_c for a second time. Since t_x - 1 = 6 objects move clockwise, object 4_c
moves over five edges after the first swap before it is swapped with 2_c for a second
time. Thus, 2_c and 4_c are swapped for a second time over the edge {0,19}.


The first two cases give rise to the notion of consistent selections. A selection ι_A
for a block A is consistent if each object x_i in ι_A can be swapped with each object x_j
that has type 0 or a type in A over the respective edges e ∈ e^ι_{x_i,x_j}. Therein, ι
is any selection that generalizes ι_A, that is, ι ⊇ ι_A. For the sake of readability,
we use C_A = ⋃_{α∈A} C_α to denote the set of all objects of a type in A.


Definition 7.6. Let A be a block and let ι_A be a selection for A. Let x_i ∈ ι_A
be an object. Then, ι_A is consistent if


• for any object x_j of type 0 or x_j ∈ C_A \ ι_A and
• for any edge e ∈ e^ι_{x_i,x_j} for any selection ι ⊇ ι_A


the agents incident to e can agree on swapping x_i and x_j.


We call a selection inconsistent if it is not consistent. Note that a given
selection ι_A for a block A can be checked for consistency in O(n^2 · |A|) time by
iterating over all x_i ∈ ι_A and all x_j ∈ X, computing the set e^ι_{x_i,x_j} in O(n) time,
and checking whether the incident agents can agree on the swap in constant time using
the preprocessed pos-values.

The third case (checking whether two objects of types in different
blocks can be swapped) gives rise to the notion of compatible selections. We say
that two selections ι_A and ι_B for blocks A and B are compatible if all pairs of
objects of types in A and B, respectively, can be swapped at their respective
edges.


Definition 7.7. Let A = [α, β] and B = [γ, δ] be two blocks. Let ι_A and ι_B
be two selections for A and B, respectively. The selections are compatible if


• for all x_i ∈ ι_A and all x_j ∈ C_B \ ι_B,
• for all x_i ∈ C_A \ ι_A and all x_j ∈ ι_B,


and each edge e ∈ e^ι_{x_i,x_j} for some ι ⊇ ι_A ∪ ι_B, the agents incident to e can agree
on swapping x_i and x_j.


We say that two selections ι_A and ι_B are incompatible if they are not compatible.
Observe that given two selections ι_A and ι_B for blocks A and B, we can
check them for compatibility in O(|A| · |B| · n) time by iterating over all x_i ∈ ι_A
and all x_j ∈ C_B \ ι_B and computing the respective set of edges in O(n) time
using Lemma 7.24. Afterwards, we iterate over all x_i ∈ C_A \ ι_A and all x_j ∈ ι_B
and compute the respective set of edges. Checking whether the two agents
incident to each of the at most two edges can agree on swapping x_i and x_j takes
constant time as the pos-values are precomputed. It remains to find consistent
selections for each block that are pairwise compatible. We solve this problem
using 2-SAT programming. Therein, we have a variable for each block. The
truth value of each variable represents which selection for the respective block
is chosen and the clauses guarantee that no two incompatible selections are
chosen.


Theorem 7.25. REACHABLE OBJECT on cycles can be solved in O(n^4) time.


Proof. We start the proof with an overview of the algorithm and an analysis
of its running time. Afterwards, we show that the algorithm is correct. For each
possible choice of object z (there are at most n possibilities) and each of
the two possible directions in which we assume x to move, we do the
following. First, compute the type and subtype of each object. Second, compute
all blocks and use Lemma 7.23 to compute two possible selections ι^1_A and ι^2_A for each
block A in overall O(n) time. Afterwards, check each of these selections for
consistency in overall O(n^3) time and discard all inconsistent selections. Next,
compute for each pair of consistent selections for different blocks whether they
are compatible. This takes overall O(n^3) time. Finally, check whether there are
pairwise compatible selections for each block using the 2-SAT program below
and return true if so.

Before we present the 2-SAT program, we describe a small preprocessing
step. If for some block A there is only one consistent selection ι_A for A, then we
discard all selections that are not compatible with ι_A, as no set of pairwise
compatible and consistent selections for all blocks can omit ι_A. Since
all remaining selections are compatible with ι_A, we can ignore ι_A from now on.
If this rule discards any selection, then the respective other selection for this
block is the only consistent selection for this block and hence we repeat the
process. After at most n rounds, each of which only takes O(n) time, we arrive
at a situation where there are exactly two consistent selections for each block
and the task is to find a set of pairwise compatible selections that includes a
selection for each block. We finally reduce this problem to a 2-SAT formula.
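
The preprocessing rule can be illustrated with a short Python sketch. It is not the implementation behind the stated running time; the function name, the dictionary `candidates` (block to set of consistent selections), and the set `incompatible` of unordered pairs of (block, selection) are hypothetical names chosen for illustration. The sketch additionally reports failure when some block is left without any selection.

```python
from collections import deque


def propagate_forced_selections(candidates, incompatible):
    """Repeatedly fix blocks that have a single consistent selection left.

    candidates:   dict mapping a block to a set of at most two selection ids
                  (hypothetical representation of the consistent selections).
    incompatible: set of frozenset({(block_a, sel_a), (block_b, sel_b)}).
    Returns (forced, open_blocks), or None if some block has no selection left.
    """
    # Work on copies so that the caller's data stays untouched.
    remaining = {block: set(sels) for block, sels in candidates.items()}
    forced = {}
    queue = deque(b for b, sels in remaining.items() if len(sels) <= 1)
    while queue:
        block = queue.popleft()
        if block in forced:
            continue
        if not remaining[block]:
            return None  # no consistent selection is left for this block
        (choice,) = remaining[block]
        forced[block] = choice
        # Discard every selection of another block that clashes with the choice.
        for other, sels in remaining.items():
            if other == block or other in forced:
                continue
            bad = {s for s in sels
                   if frozenset({(block, choice), (other, s)}) in incompatible}
            if bad:
                sels -= bad
                queue.append(other)  # at most one selection remains here
    open_blocks = {b: s for b, s in remaining.items() if b not in forced}
    return forced, open_blocks
```

The blocks returned in `open_blocks` are those that still have two candidate selections and are therefore left to the 2-SAT formula.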

We start with a variable v_B for each block B which is set to true if we select ι^1_B
and set to false if we select ι^2_B. For each pair (ι_A, ι_B) of incompatible selections
for different blocks A and B, do the following. For the sake of simplicity, we use u
and w to denote the literals representing the selections for blocks A and B that
are incompatible. Formally, if ι_A = ι^1_A, then u := v_A and otherwise u := ¬v_A.
Analogously, if ι_B = ι^1_B, then w := v_B and otherwise w := ¬v_B. Since we
cannot select ι_A and ι_B at the same time (the formula cannot satisfy u and w
at the same time), we add the clause (¬u ∨ ¬w) to our 2-SAT formula.
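
The clause construction can be sketched in Python as follows. The sketch assumes that every block has exactly two selections, numbered 1 and 2, and that the incompatible pairs are given explicitly; the names `solve_2sat` and `pick_compatible_selections` are hypothetical. The solver uses the standard implication-graph approach with strongly connected components (Kosaraju's algorithm), which is one linear-time method and not necessarily the one from [APT79].

```python
def solve_2sat(num_vars, clauses):
    """Solve a 2-SAT formula; a literal is a pair (variable index, is_positive).

    Returns a list of truth values or None if the formula is unsatisfiable.
    """
    n = 2 * num_vars

    def node(lit):
        var, positive = lit
        return 2 * var + (0 if positive else 1)  # node ^ 1 is the negation

    graph = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for a, b in clauses:
        # (a or b) is equivalent to (not a -> b) and (not b -> a).
        for u, v in ((node(a) ^ 1, node(b)), (node(b) ^ 1, node(a))):
            graph[u].append(v)
            rev[v].append(u)
    # First pass: iterative depth-first search recording the finishing order.
    order, seen = [], [False] * n
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(graph[u]):
                stack.append((u, i + 1))
                v = graph[u][i]
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)
    # Second pass: label components on the reversed graph; the labels then
    # follow a topological order of the implication graph.
    comp, label = [-1] * n, 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in rev[u]:
                if comp[v] == -1:
                    comp[v] = label
                    stack.append(v)
        label += 1
    assignment = []
    for var in range(num_vars):
        if comp[2 * var] == comp[2 * var + 1]:
            return None  # a variable is equivalent to its own negation
        assignment.append(comp[2 * var] > comp[2 * var + 1])
    return assignment


def pick_compatible_selections(blocks, incompatible):
    """Return a dict block -> 1 or 2 avoiding all incompatible pairs, or None."""
    index = {block: i for i, block in enumerate(blocks)}
    clauses = []
    for (block_a, sel_a), (block_b, sel_b) in incompatible:
        # u is v_A if sel_a == 1 and (not v_A) otherwise; the clause
        # (not u or not w) therefore uses the opposite polarities.
        clauses.append(((index[block_a], sel_a != 1), (index[block_b], sel_b != 1)))
    assignment = solve_2sat(len(blocks), clauses)
    if assignment is None:
        return None
    return {block: 1 if assignment[index[block]] else 2 for block in blocks}
```

A variable set to true corresponds to choosing the first selection of its block, mirroring the convention in the text.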

Observe that if there is a set of pairwise compatible selections for each block,
then the 2-SAT formula is satisfied by the corresponding truth assignment of the
variables. Conversely, if the formula is satisfiable, then the selections for each
block corresponding to a satisfying truth assignment specify a direction for each
object. Any sequence of swaps that follows these directions will eventually lead
to agent I obtaining object x. Since 2-SAT can be solved in linear time [APT79]
and the constructed formula has O(n^2) clauses of constant size, the overall
running time for each possible choice of z and the direction of x is in O(n^3).
Since there are O(n) possible choices for combinations of z and the direction
of x, the overall running time for all iterations is in O(n^4).
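
The running-time accounting of the proof can be summarized as follows. This is only a sketch: it lists the steps for which a bound is stated, and it assumes that the sizes of all blocks sum to at most n, so that the consistency and compatibility checks each take O(n^3) time in total.

```latex
\begin{align*}
T_{\text{iteration}}(n)
  &= \underbrace{O(n)}_{\text{blocks and selections}}
   + \underbrace{\sum_{A} O(n^2 \cdot |A|)}_{\text{consistency}}
   + \underbrace{\sum_{A \neq B} O(|A| \cdot |B| \cdot n)}_{\text{compatibility}}
   + \underbrace{n \cdot O(n)}_{\text{preprocessing}}
   + \underbrace{O(n^2)}_{\text{2-SAT}} \\
  &= O(n^3), \\
T(n)
  &= \underbrace{2n}_{\text{choices of } z \text{ and direction of } x}
     \cdot \, T_{\text{iteration}}(n) = O(n^4).
\end{align*}
```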


7.4 Concluding Remarks 


We investigated the computational complexity of REACHABLE OBJECT with 
respect to restrictions on the maximum degree of the input graph and the 
maximum length of preference lists. Our work narrows the gap between known 
tractable and intractable cases leading to a more comprehensive understanding 
of the computational complexity of REACHABLE OBJECT. In particular, we 
showed a dichotomy result regarding the length of the preference lists of the 
agents and showed polynomial-time solvability for REACHABLE OBJECT on 
graphs with maximum degree at most two (note that a graph of maximum 
degree two is the disjoint union of paths and cycles and Huang and Xiao [HX20] 
resolved the case of paths). Saffidine and Wilczynski [SW18, Theorem 4] showed 
NP-hardness of REACHABLE OBJECT on graphs of maximum degree at most 
four. Hence, the computational complexity of REACHABLE OBJECT on graphs 
of maximum degree three remains the only open case towards a dichotomy result 
with respect to the parameter maximum degree. We conjecture that this case 
is NP-hard. Other interesting questions regarding the maximum degree of a
graph are whether REACHABLE OBJECT is polynomial-time solvable on trees 
if the maximum degree is some constant and whether our running-time bound 
of O(n^4) for graphs of maximum degree two is tight, that is, can it be improved
to, e.g., O(n^3 log n) or is there some (e.g., ETH-based) lower bound?

Note that in a cycle each object can take one of two paths towards its target 
object and these two paths translate to assigning each variable in our 2-SAT 
program one of the two possible truth values true or false. Jansen [Jan17] used 
a variant of 2-SAT where each variable can have one of N values (where N 
is some constant) to show containment in P for a variant of HITTING SET. 
It would be interesting to see whether there are graph classes in which each 
object can take one of constantly many paths to its target object where this 
generalization of 2-SAT can be used to show polynomial-time solvability. 
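
For constant N, threshold literals of this kind can be translated into ordinary 2-SAT via the well-known order encoding. The following Python sketch illustrates that translation under stated assumptions; it is not necessarily the construction used in [Jan17] or [BHM00]. The function names and the literal format `(variable, ">=" or "<=", threshold)` are hypothetical, and thresholds that would make a literal constantly true or false are rejected to keep the sketch short.

```python
def encode_threshold_2sat(num_vars, n_values, clauses):
    """Translate clauses over variables with values in {1, ..., n_values}.

    Each clause is a pair of literals (var, op, t) with op in {">=", "<="}.
    Returns (number of Boolean variables, list of Boolean 2-clauses), where a
    Boolean literal is a pair (index, is_positive).
    """
    width = n_values - 1  # one Boolean variable per threshold t = 2..n_values

    def index(var, t):
        # Boolean variable with the meaning "var >= t".
        return var * width + (t - 2)

    def literal(var, op, t):
        if op == ">=" and 2 <= t <= n_values:
            return (index(var, t), True)
        if op == "<=" and 1 <= t <= n_values - 1:
            return (index(var, t + 1), False)  # "var <= t" is "not var >= t+1"
        raise ValueError("unsupported or constant literal")

    out = []
    for var in range(num_vars):
        for t in range(2, n_values):
            # Consistency chain: "var >= t+1" implies "var >= t".
            out.append(((index(var, t + 1), False), (index(var, t), True)))
    for a, b in clauses:
        out.append((literal(*a), literal(*b)))
    return num_vars * width, out


def decode(assignment, num_vars, n_values):
    """Recover the value of each variable from a Boolean assignment."""
    width = n_values - 1
    return [1 + sum(assignment[v * width:(v + 1) * width]) for v in range(num_vars)]
```

The resulting formula has only a constant factor more variables than the original one when N is constant and can be passed to any standard 2-SAT solver.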

Regarding modifications and generalizations of REACHABLE OBJECT, note
that in the HOUSING MARKET problem the agents can swap not only in pairs but
also in trading cycles. Trading cycles quite naturally translate into hyperedges
in the input graph of REACHABLE OBJECT. A set of agents can swap their
currently held objects along a trading cycle only if they share a hyperedge. This
generalization of REACHABLE OBJECT seems to be a quite natural link between
HOUSING MARKET and REACHABLE OBJECT and has not been studied so far.


Chapter 8


Outlook 


In this thesis, we shed some light on the computational complexity of special cases 
of different graph problems using mostly dynamic and 2-SAT programming. We 
conclude the thesis with a summary of our main results and a broader reflection 
on what we observed and how our work can be continued. Since we already 
provided directions for further research related to the problems studied in the 
respective chapters, we will focus on dynamic and 2-SAT programming here. 

We start with a summary of our main results. For DIAMETER, we presented 
results within the field of FPT in P. In particular, using dynamic programming, 
we showed that DIAMETER is solvable in O((n + m) - h- (d” + h@)) time when 
parameterized by the h-index and diameter d. We further presented an O(n?-m)- 
time algorithm for LENGTH-BOUNDED CUT on proper interval graphs and proved
that k-DISJOINT SHORTEST PATHS is solvable in nO(*+)) time. 

Using 2-SAT programming, we showed that SOFT TREE CONTAINMENT is 
solvable in O(n?) time when the input network is a 2-labeled phylogenetic tree. 
Complementing this result, we showed that SOFT TREE CONTAINMENT remains 
NP-hard when restricted to binary 3-labeled phylogenetic trees. Finally, we 
showed how to solve REACHABLE OBJECT in O(n^4) time on cycles and proved a
dichotomy result on arbitrary graphs parameterized by the length of the longest 
preference list of an agent. If all preference lists are of length at most three, 
then the problem can be solved in linear time and for lists of length at most 
four it remains NP-hard. 

We continue with some concluding thoughts on dynamic and 2-SAT pro- 
gramming. Concerning dynamic programming, note that the three problems 
we studied in the first part of this thesis were all related to shortest paths 
in graphs. All three respective dynamic programs used the length of solution 
paths to some extent in the representation of a table entry. For DIAMETER,
one of the dimensions of the respective table measures the length of a longest
shortest path in the input graph. For LENGTH-BOUNDED CUT, each table entry 
represents the minimum number of edges to delete in order to increase the length 
of a shortest path between the terminal s and each vertex in a given subset of 
vertices including t to some given threshold. Finally, for k-DISJOINT SHORTEST 
PATHS, each table entry represents whether there are disjoint paths between 
terminal pairs in a directed acyclic graph. We iterate over a topological order 
of this graph and hence allow for longer and longer disjoint paths. Concluding, 
we observed the following heuristic for how to use dynamic programming: If 
the problem is about paths in graphs, then try to design a dynamic program that 
iteratively allows for longer and longer paths. 
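
As a generic illustration of this heuristic, the following Python sketch fills a table whose first dimension is the maximum number of edges a path may use. It is a textbook Bellman-Ford-style dynamic program and not one of the dynamic programs developed in this thesis; the function name and the edge-list input format are assumptions made for the example.

```python
import math


def shortest_paths_by_length(num_vertices, edges, source):
    """table[k][v] is the minimum weight of a walk from source to v that uses
    at most k edges (math.inf if no such walk exists).

    edges is a list of directed edges (u, v, weight).
    """
    table = [[math.inf] * num_vertices]
    table[0][source] = 0
    for _ in range(1, num_vertices):
        prev = table[-1]
        row = list(prev)  # every walk with at most k-1 edges is still allowed
        for u, v, weight in edges:
            # Extend a walk with at most k-1 edges by one further edge.
            if prev[u] + weight < row[v]:
                row[v] = prev[u] + weight
        table.append(row)
    return table
```

Each row is computed only from the previous one, so the table grows by allowing one more edge per step.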

Finally, we reflect on 2-SAT programming and answer the question we started 
this thesis with: When is 2-SAT programming a promising tool for solving 
algorithmic problems? Let us begin with revisiting how we and other authors 
used 2-SAT programming. All 2-SAT programs (including the ones from the 
literature) had in common that they, to some extent, compute a solution consist- 
ing of a set of pairwise compatible elements. In this thesis, these elements were 
either canonical vertices (in Chapter 6) or selections for blocks (in Chapter 7). 
Notice that in both cases exactly one out of at most two alternatives was chosen 
in the solution. The same holds true for all 2-SAT programs that we could find 
in the literature with one exception. Jansen [Jan17] used a version of 2-SAT 
where each variable can have one of N possible truth values in [N] (where N is 
some constant) and a literal expresses that the truth value of a certain variable 
is at least or at most some given threshold. This version of 2-SAT is known to 
be polynomial-time solvable [BHM00]. Combining these insights, we present 
two heuristics of when to try applying 2-SAT programming to a new problem. 


1. A (polynomial-time) reduction from the considered problem to SATISFIA- 
BILITY or 3-SAT is known or easy to achieve. Observe in what special 
cases the reduction yields 2-SAT formulas. 


2. The considered problem is (thought to be) polynomial-time solvable and 
has some independence structure, that is, a solution consists of some 
elements that can 


   • be partitioned into constant-size parts and at most one element from
     each part is picked into the solution and


   • a set of elements forms a solution if each pair of elements in this set
     can be contained in the same solution.

With these heuristics at hand, we remark that we could not find any application
of 2-SAT programming in the context of scheduling. This is surprising as schedul- 
ing is fundamentally about finding pairwise non-conflicting assignments of jobs 
to machines and time slots. We therefore conjecture that 2-SAT programming 
should be applicable in the context of scheduling quite often. 

The importance of 2-SAT programming is not limited to exact polynomial- 
time algorithms either. 2-SAT programming has already been used in an 
approximation algorithm [WW95] and we believe that it might have further 
applications, for instance in heuristics or data reductions. It might even be 
useful to reduce some NP-hard problem to an exponential number of 2-SAT 
formulas (or one formula of exponential size) to achieve faster exponential-time 
algorithms. 

Finally, there are other special cases of SATISFIABILITY that are polynomial- 
time solvable. Most notably, XOR-SAT (clauses consist of exclusive-or oper- 
ations and clauses are connected by and operations) is also linear-time solv- 
able [Sch78]. We could only find a single reference ([Rad+07]) where a problem 
was reduced to XOR-SAT but there it was not used for a polynomial-time 
algorithm. We believe that XOR-SAT programming is also worth investigating 
as a potential tool for exact polynomial-time algorithms and we believe it to be 
most useful for problems in which the parity of numbers is important. 
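
To make the notion concrete, here is a minimal Python sketch of an XOR-SAT solver based on Gaussian elimination over GF(2). This simple variant runs in polynomial (not linear) time and is meant only as an illustration; the function name and the input format (a list of variable indices together with the required parity) are assumptions.

```python
def solve_xorsat(num_vars, equations):
    """Solve a system of XOR constraints.

    equations is a list of (variables, parity): the exclusive-or of the listed
    variables has to equal parity (0 or 1).
    Returns a list of 0/1 values or None if the system is unsatisfiable.
    """
    pivots = []  # rows (pivot bit, bit mask, parity) in reduced echelon form
    for variables, parity in equations:
        mask = 0
        for v in variables:
            mask ^= 1 << v  # a variable that occurs twice cancels out
        parity &= 1
        # Eliminate all existing pivot variables from the new row.
        for bit, pmask, pparity in pivots:
            if mask >> bit & 1:
                mask ^= pmask
                parity ^= pparity
        if mask == 0:
            if parity:
                return None  # the row reduces to the contradiction 0 = 1
            continue
        bit = mask.bit_length() - 1
        # Remove the new pivot variable from all earlier rows.
        pivots = [(b, m ^ mask, p ^ parity) if m >> bit & 1 else (b, m, p)
                  for b, m, p in pivots]
        pivots.append((bit, mask, parity))
    assignment = [0] * num_vars  # free variables are set to 0
    for bit, _, parity in pivots:
        assignment[bit] = parity
    return assignment
```

Since all non-pivot variables are set to 0, each pivot variable simply takes the parity of its row.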